Scatter
Scatterplot block.
The scatter plot is perhaps the best-known chart for plotting x and y coordinates. Basic charts like this remain useful, especially with brushing and zooming capabilities. Scatter plots can be colored per sample and used to detect relationships between (groups of) variables. The input data frame should contain two columns (x and y) with the coordinates, and the index represents the class label.
- param x:
1d coordinates x-axis.
- type x:
numpy array
- param y:
1d coordinates y-axis.
- type y:
numpy array
- param x1:
Second set of 1d coordinates x-axis.
- type x1:
numpy array
- param y1:
Second set of 1d coordinates y-axis.
- type y1:
numpy array
- param x2:
Third set of 1d coordinates x-axis.
- type x2:
numpy array
- param y2:
Third set of 1d coordinates y-axis.
- type y2:
numpy array
- param jitter:
Add jitter to the data points as random normal noise. A value of 0.01 usually works well to separate one-hot data.
- type jitter:
float, default: None
- param size:
Size of the samples.
- type size:
list/array with the same size as (x,y).
- param color:
‘#ffffff’ : All dots get the same hex color.
None: The same color as for c is applied.
[‘#000000’, ‘#ffffff’,…]: list/array of hex colors with same size as (x,y)
- type color:
list/array of hex colors with same size as (x,y)
- param stroke:
- Edge color of the dots in hex colors.
‘#000000’ : All dots get the same hex color.
[‘#000000’, ‘#ffffff’,…]: list/array of hex colors with same size as (x,y)
- type stroke:
list/array of hex colors with same size as (x,y)
- param c_gradient:
- Hex color used to make a linear gradient based on the density.
None: Do not use a gradient.
opaque: Towards the edges the points become more transparent. This stresses the dense areas and keeps the scatter plot tidy.
‘#FFFFFF’: Towards the edges it smooths into this color
- type c_gradient:
String, (default: ‘opaque’)
- param opacity:
Opacity of the dots. If a list/array is given, it should have the same size as (x,y).
- type opacity:
float or list/array [0-1]
- param tooltip:
Labels of the samples.
- type tooltip:
list of labels with same size as (x,y)
- param cmap:
- All colors can be reversed with ‘_r’, e.g. ‘binary’ to ‘binary_r’
‘tab20c’, ‘Set1’, ‘Set2’, ‘rainbow’, ‘bwr’, ‘binary’, ‘seismic’, ‘Blues’, ‘Reds’, ‘Pastel1’, ‘Paired’, ‘twilight’, ‘hsv’
- type cmap:
String, (default: ‘inferno’)
- param scale:
Scale datapoints. The default is False.
- type scale:
Bool, optional
- param label_radio:
The labels used for the radiobuttons.
- type label_radio:
List [‘(x, y)’, ‘(x1, y1)’, ‘(x2, y2)’]
- param set_xlim:
Width of the x-axis: The default is extracted from the data with 10% spacing.
- type set_xlim:
tuple, (default: [None, None])
- param set_ylim:
Height of the y-axis: The default is extracted from the data with 10% spacing.
- type set_ylim:
tuple, (default: [None, None])
- param title:
- Title of the figure.
‘Scatterplot’
- type title:
String, (default: None)
- param filepath:
- File path to save the output.
Temporary path: ‘d3blocks.html’
Relative path: ‘./d3blocks.html’
Absolute path: ‘c://temp//d3blocks.html’
None: Return HTML
- type filepath:
String, (Default: user temp directory)
- param figsize:
- Size of the figure in the browser, [width, height].
[900, 600]
- type figsize:
tuple
- param showfig:
True: Open browser-window.
False: Do not open browser-window.
- type showfig:
bool, (default: True)
- param overwrite:
True: Overwrite the html in the destination directory.
False: Do not overwrite destination file but show warning instead.
- type overwrite:
bool, (default: True)
- param notebook:
True: Use IPython to show chart in notebook.
False: Do not use IPython.
- type notebook:
bool
- param save_button:
True: Save button is shown in the HTML to save the image in svg.
False: No save button is shown in the HTML.
- type save_button:
bool, (default: True)
- param reset_properties:
True: Reset the node_properties at each run.
False: Use the d3.node_properties()
- type reset_properties:
bool, (default: True)
- returns:
d3.node_properties (DataFrame or dictionary) – Contains properties of the unique input label/nodes/samples.
d3.edge_properties (DataFrame or dictionary) – Contains properties of the unique input edges/links.
d3.config (dictionary) – Contains configuration properties.
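The effect of the jitter parameter described above can be illustrated with a minimal sketch. It uses only the Python standard library and plain lists instead of numpy arrays. The helper name add_jitter is hypothetical and not part of d3blocks, and the library's internal implementation may differ.

```python
import random


def add_jitter(values, jitter=0.01, seed=None):
    """Return a copy of `values` with zero-mean normal noise added.

    `jitter` is used as the standard deviation of the noise.
    With jitter=None the values are returned unchanged.
    (Hypothetical helper for illustration; not the d3blocks implementation.)
    """
    if jitter is None:
        return list(values)
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, jitter) for v in values]


# One-hot style coordinates: many points share exactly the same position,
# so they would be drawn on top of each other.
x = [0, 0, 1, 1, 1, 0]
x_jittered = add_jitter(x, jitter=0.01, seed=42)
print(x_jittered)
```

With a small standard deviation such as 0.01, the points stay close to their original position but no longer overlap exactly.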
Examples
>>> # Load d3blocks
>>> from d3blocks import D3Blocks
>>> #
>>> # Initialize
>>> d3 = D3Blocks()
>>> #
>>> # Load example data
>>> df = d3.import_example('cancer')
>>> #
>>> # Set size and tooltip
>>> size = df['survival_months'].fillna(1).values / 20
>>> tooltip = df['labx'].values + ' <br /> Survival: ' + df['survival_months'].astype(str).str[0:4].values
>>> #
>>> # Scatter plot
>>> d3.scatter(df['tsneX'].values,
...            df['tsneY'].values,
...            size=size,
...            color=df['labx'].values,
...            stroke='#000000',
...            opacity=0.4,
...            tooltip=tooltip,
...            filepath='scatter_demo.html',
...            cmap='tab20')
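The color parameter also accepts a list of hex colors with the same size as (x,y). As a rough sketch of how such a list could be built by hand from class labels, the snippet below assigns one color per unique label. The palette, the helper name labels_to_hex, and the label 'other' are assumptions made for illustration; this is not how d3blocks maps labels to a colormap internally.

```python
# Hypothetical palette; any list of hex colors would work here.
PALETTE = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd']


def labels_to_hex(labels, palette=PALETTE):
    """Assign one hex color per unique label, cycling through the palette."""
    unique = sorted(set(labels))
    lookup = {lab: palette[i % len(palette)] for i, lab in enumerate(unique)}
    return [lookup[lab] for lab in labels]


# Placeholder labels; the result has one color per sample.
labels = ['acc', 'acc', 'brca', 'other', 'brca']
colors = labels_to_hex(labels)
print(colors)
```

The resulting list has one entry per sample, so it can be passed wherever a list/array of hex colors with the same size as (x,y) is expected.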
Examples
>>> # Scatter plot with transitions. Note that scale is set to True to make the axes comparable to each other
>>> d3.scatter(df['tsneX'].values,
...            df['tsneY'].values,
...            x1=df['PC1'].values,
...            y1=df['PC2'].values,
...            label_radio=['tSNE', 'PCA'],
...            scale=True,
...            size=size,
...            color=df['labx'].values,
...            stroke='#000000',
...            opacity=0.4,
...            tooltip=tooltip,
...            filepath='scatter_transitions2.html',
...            cmap='tab20')
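To see why scaling helps when switching between coordinate sets, consider the sketch below. It rescales two sets with very different ranges to [0, 1] using min-max scaling. Min-max scaling is an assumption chosen for illustration; the exact method applied by scale=True is not described here. The helper name minmax_scale and the numbers are made up.

```python
def minmax_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; map them to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Two coordinate sets on very different scales (made-up numbers).
tsne_x = [37.2, -5.8, 0.8, -9.4]
pc1 = [120.0, -40.0, 15.0, -80.0]

tsne_x_scaled = minmax_scale(tsne_x)
pc1_scaled = minmax_scale(pc1)
print(tsne_x_scaled)
print(pc1_scaled)
```

After scaling, both sets span the same range, so a transition between them does not push points outside the visible axes.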
Examples
>>> # Scatter plot with transitions between three sets of coordinates. Note that scale is set to True to make the axes comparable to each other
>>> d3.scatter(df['tsneX'].values,
...            df['tsneY'].values,
...            x1=df['PC1'].values,
...            y1=df['PC2'].values,
...            x2=df['PC2'].values,
...            y2=df['PC1'].values,
...            label_radio=['tSNE', 'PCA', 'PCA_reverse'],
...            scale=True,
...            size=size,
...            color=df['labx'].values,
...            stroke='#000000',
...            opacity=0.4,
...            tooltip=tooltip,
...            filepath='scatter_transitions3.html',
...            cmap='tab20')
Examples
>>> # Load d3blocks
>>> from d3blocks import D3Blocks
>>> #
>>> # Initialize
>>> d3 = D3Blocks(chart='Scatter')
>>> #
>>> # Import example
>>> df = d3.import_example('cancer')
>>> #
>>> # Set properties
>>> d3.set_edge_properties(df['tsneX'].values,
...                        df['tsneY'].values,
...                        x1=df['PC1'].values,
...                        y1=df['PC2'].values,
...                        label_radio=['tSNE', 'PCA'],
...                        size=df['survival_months'].fillna(1).values / 10,
...                        color=df['labx'].values,
...                        tooltip=df['labx'].values + ' <br /> Survival: ' + df['survival_months'].astype(str).str[0:4].values,
...                        scale=True)
>>> #
>>> # Show the chart
>>> d3.show()
>>> #
>>> # Set specific properties for a single sample (here the sample at index 0).
>>> print(d3.edge_properties)
>>> d3.edge_properties.loc[0,'size']=50
>>> d3.edge_properties.loc[0,'color']='#000000'
>>> d3.edge_properties.loc[0,'tooltip']='I am adjusted!'
>>> #
>>> # Configuration can be changed too.
>>> print(d3.config)
>>> #
>>> # Show the chart again with adjustments
>>> d3.show()
Input Data
The input dataset consists of the x-coordinates and y-coordinates, which need to be specified separately.
#                 x          y   age  ...  labels
# labels                              ...
# acc     37.204296  24.162813  58.0  ...     acc
# acc     37.093090  23.423557  44.0  ...     acc
# acc     36.806297  23.444910  23.0  ...     acc
# acc     38.067886  24.411770  30.0  ...     acc
# acc     36.791195  21.715324  29.0  ...     acc
# ...           ...        ...   ...  ...     ...
# brca     0.839383  -8.870781   NaN  ...    brca
# brca    -5.842904   2.877595   NaN  ...    brca
# brca    -9.392038   1.663352  71.0  ...    brca
# brca    -4.016389   6.260741   NaN  ...    brca
# brca     0.229801  -8.227086   NaN  ...    brca
# [4674 rows x 9 columns]
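Because x and y are passed separately, labeled records first need to be split into parallel sequences. A minimal sketch in plain Python, using a few rows from the table above, could look as follows. The variable names are illustrative only; with a pandas DataFrame the same is achieved by selecting the columns, as in the examples.

```python
# Each record is (label, x, y), taken from the table above.
records = [
    ('acc', 37.204296, 24.162813),
    ('acc', 37.093090, 23.423557),
    ('brca', 0.839383, -8.870781),
    ('brca', -5.842904, 2.877595),
]

# Transpose the records into three parallel lists of equal length.
labels, x, y = (list(column) for column in zip(*records))

print(labels)
print(x)
print(y)
```

The three lists have the same length, so the label at position i belongs to the point (x[i], y[i]).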