🕷️ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 167 (from laksa194)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📄
INDEXABLE
CRAWLED
9 days ago
🤖
ROBOTS ALLOWED

Page Info Filters

| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.3 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |

Page Details

| Property | Value |
|---|---|
| URL | https://anandology.com/python-for-bioinformatics/seaborn.html |
| Last Crawled | 2026-04-03 22:12:13 (9 days ago) |
| First Indexed | 2025-03-14 00:20:25 (1 year ago) |
| HTTP Status Code | 200 |
| Meta Title | Python for Bioinformatics - 3 Advaned Visualizations using Seaborn |
| Meta Description | null |
| Meta Canonical | null |
Boilerpipe Text
While matplotlib provides simple visualization charts that are easy to generate, the seaborn library provides more sophisticated charts that are often handy for presenting complex bioinformatics data.

Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

Seaborn has three categories of charts: relplot (relational), displot (distributions) and catplot (categories).

[Figure: Seaborn organization. From Seaborn Tutorial.]

Palmer Penguins

[Figure: Palmer Penguins]

In this section, we’ll use the Palmer Penguins dataset, which is available through seaborn. The dataset includes measurements for penguin species, island in Palmer Archipelago, size (flipper length, body mass, bill dimensions), and sex.

    import seaborn as sns

    df = sns.load_dataset("penguins")

    df.head()

      species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
    0  Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
    1  Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
    2  Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
    3  Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
    4  Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female

    df.shape

    (344, 7)

While seaborn can ignore missing values when plotting, it is generally a good idea to clean the data by removing missing values before starting any exploration.

    # number of missing values in each column
    df.isna().sum()

    species               0
    island                0
    bill_length_mm        2
    bill_depth_mm         2
    flipper_length_mm     2
    body_mass_g           2
    sex                  11
    dtype: int64

    # drop the rows with missing values
    df.dropna(inplace=True)

    # number of rows and columns in the data after dropping rows with missing values
    df.shape

    (333, 7)

Scatterplot

    df.head()

      species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
    0  Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
    1  Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
    2  Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
    4  Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
    5  Adelie  Torgersen            39.3           20.6              190.0       3650.0    Male

The scatter plot allows visualizing two dimensions. More dimensions can be added to a scatter plot by controlling color, size and style.

    # scatterplot of bill length vs bill depth
    sns.scatterplot(df, x="bill_length_mm", y="bill_depth_mm")

    <Axes: xlabel='bill_length_mm', ylabel='bill_depth_mm'>

The scatterplot is a special kind of relplot, so we can get the same output from relplot as well. When we specify both x and y arguments to relplot, it draws a scatter plot.

    sns.relplot(df, x="bill_length_mm", y="bill_depth_mm")

    # scatterplot of bill length vs bill depth
    # with color by the species
    sns.scatterplot(df, x="bill_length_mm", y="bill_depth_mm", hue="species")

    <Axes: xlabel='bill_length_mm', ylabel='bill_depth_mm'>

    # scatterplot of bill length vs bill depth
    # with color by the species and style by sex
    sns.scatterplot(df, x="bill_length_mm", y="bill_depth_mm", hue="species", style="sex")

    <Axes: xlabel='bill_length_mm', ylabel='bill_depth_mm'>

    # scatterplot of bill length vs bill depth
    # with color by the species, style by sex and size by body weight
    sns.scatterplot(df, x="bill_length_mm", y="bill_depth_mm", hue="species", style="sex", size="body_mass_g")

    <Axes: xlabel='bill_length_mm', ylabel='bill_depth_mm'>

Seaborn allows drawing lines on the graphs using matplotlib primitives.
    g = sns.relplot(data=df, x="bill_length_mm", y="bill_depth_mm", hue="species")

    # draw a line using start and end points
    g.ax.axline(xy1=(30, 13), xy2=(60, 19), color="g", dashes=(5, 2))

    # draw a line using start point and slope
    g.ax.axline(xy1=(35, 13), slope=.6, color="r", dashes=(5, 2))

    <matplotlib.lines._AxLine at 0x7f6f9e001f30>

Distributions

    sns.displot(df, x="flipper_length_mm", kind="hist")
    # equivalent axes-level call: sns.histplot(df, x="flipper_length_mm")

    sns.displot(df, x="flipper_length_mm", kind="kde")
    # equivalent axes-level call: sns.kdeplot(df, x="flipper_length_mm")

The displots allow grouping by color.

    sns.displot(df, x="flipper_length_mm", kind="kde", hue="species")

We can stack multiple distributions on top of each other.

    sns.displot(df, x="flipper_length_mm", kind="kde", hue="species", multiple="stack")

We could do the same with histograms.

    sns.displot(df, x="flipper_length_mm", kind="hist", hue="species", multiple="stack")

Categorical plots

Catplots allow visualizing categorical data. The default view is a scatter plot with a small jitter added to make the points visible.

    sns.catplot(df, x="species", y="bill_length_mm")

A slightly better looking version of that is a swarm plot.

    sns.catplot(df, x="species", y="bill_length_mm", kind="swarm")

We can add another dimension using hue.

    sns.catplot(df, x="species", y="bill_length_mm", kind="swarm", hue="sex")

We could even flip the axes, if we want.

    sns.catplot(df, x="bill_length_mm", y="species", kind="swarm", hue="sex")

Comparing Distributions

The boxplot and violinplot, both kinds of catplots, allow comparing distributions.

    sns.boxplot(df, y='bill_length_mm')

    <Axes: ylabel='bill_length_mm'>

    sns.violinplot(df, y='bill_length_mm')

    <Axes: ylabel='bill_length_mm'>

Both these plots allow splitting the distribution by a categorical column.
    sns.violinplot(df, y='bill_length_mm', x="species")

    <Axes: xlabel='species', ylabel='bill_length_mm'>

We could add another dimension using hue.

    sns.violinplot(df, y='bill_length_mm', x="species", hue="sex")

    <Axes: xlabel='species', ylabel='bill_length_mm'>

We could use the space better by splitting the violin when there are only two categories.

    sns.violinplot(df, y='bill_length_mm', x="species", hue="sex", split=True)

    <Axes: xlabel='species', ylabel='bill_length_mm'>

Combining multiple views on the data

The jointplot and pairplot functions plot both relationships and distributions in a single graph.

    sns.jointplot(df, x="bill_length_mm", y="bill_depth_mm", height=3)

    sns.jointplot(df, x="bill_length_mm", y="bill_depth_mm", hue="species", height=4)

The pairplot shows relations between all the numerical columns in a single grid.

    sns.pairplot(data=df, hue="species")

Showing multiple charts

Seaborn allows showing a grid of charts for displaying more information.

    sns.relplot(df, x="bill_length_mm", y="bill_depth_mm", col="species", height=3)

    sns.relplot(df, x="bill_length_mm", y="bill_depth_mm", hue="sex", col="species", height=3)

    sns.relplot(df, x="bill_length_mm", y="bill_depth_mm", col="species", row="sex", height=3)

When there are too many categories, we can even specify col_wrap.

    sns.relplot(df, x="bill_length_mm", y="bill_depth_mm", hue="sex", col="species", col_wrap=2, height=3)

This functionality is similar to facet_wrap in R (ggplot2).

Multiple Charts in a grid

    import matplotlib.pyplot as plt

    f, axs = plt.subplots(1, 2, figsize=(8, 3))
    sns.scatterplot(df, x="bill_length_mm", y="bill_depth_mm", hue="species", ax=axs[0])
    sns.histplot(df, x="species", hue="species", ax=axs[1])

    <Axes: xlabel='species', ylabel='Count'>
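The same `plt.subplots` pattern can put two views of one variable side by side, for example a box plot next to a violin plot. The following is a minimal sketch rather than part of the original tutorial: it uses a small made-up DataFrame (hypothetical values, with column names borrowed from the penguins data) so that it runs without downloading anything, and `sharey=True` is an added choice so both panels use the same scale.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the penguins data (values are for illustration only).
toy = pd.DataFrame({
    "species": ["Adelie"] * 4 + ["Gentoo"] * 4,
    "bill_length_mm": [39.1, 39.5, 40.3, 36.7, 46.1, 50.0, 48.7, 47.6],
})

# One row, two columns; sharey keeps both panels on the same y scale.
fig, axs = plt.subplots(1, 2, figsize=(8, 3), sharey=True)

# Axes-level functions accept ax= to choose the panel they draw into.
sns.boxplot(data=toy, x="species", y="bill_length_mm", ax=axs[0])
sns.violinplot(data=toy, x="species", y="bill_length_mm", ax=axs[1])

axs[0].set_title("boxplot")
axs[1].set_title("violinplot")
fig.tight_layout()
```

In a notebook the figure renders inline; in a script, `fig.savefig(...)` or `plt.show()` would be needed to see it.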
Markdown
# 3 Advanced Visualizations using Seaborn

While matplotlib provides simple visualization charts that are easy to generate, the `seaborn` library provides more sophisticated charts that are often handy for presenting complex bioinformatics data.

## 3.1 Seaborn

Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

Seaborn has three categories of charts: *relplot* (relational), *displot* (distributions) and *catplot* (categories).

![](https://anandology.com/python-for-bioinformatics/images/seaborn-organization.png)

Seaborn organization

From [Seaborn Tutorial](https://seaborn.pydata.org/tutorial/function_overview.html).

### 3.1.1 Palmer Penguins

![](https://anandology.com/python-for-bioinformatics/images/lter_penguins2.png)

Palmer Penguins

In this section, we’ll use the [Palmer Penguins dataset](https://github.com/allisonhorst/palmerpenguins), which is available through seaborn. The dataset includes measurements for penguin species, island in Palmer Archipelago, size (flipper length, body mass, bill dimensions), and sex.
```
import seaborn as sns
```

```
df = sns.load_dataset("penguins")
```

```
df.head()
```

| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | Male |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | Female |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | Female |
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | Female |

```
df.shape
```

```
(344, 7)
```

While seaborn can ignore missing values when plotting, it is generally a good idea to clean the data by removing missing values before starting any exploration.

```
# number of missing values in each column
df.isna().sum()
```

```
species               0
island                0
bill_length_mm        2
bill_depth_mm         2
flipper_length_mm     2
body_mass_g           2
sex                  11
dtype: int64
```

```
# drop the rows with missing values
df.dropna(inplace=True)
```

```
# number of rows and columns in the data after dropping rows with missing values
df.shape
```

```
(333, 7)
```

### 3.1.2 Scatterplot

```
df.head()
```

| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | Male |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | Female |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | Female |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | Female |
| 5 | Adelie | Torgersen | 39.3 | 20.6 | 190.0 | 3650.0 | Male |

The scatter plot allows visualizing two dimensions. More dimensions can be added to a scatter plot by controlling color, size and style.

```
# scatterplot of bill length vs bill depth
sns.scatterplot(df, x="bill_length_mm", y="bill_depth_mm")
```

```
<Axes: xlabel='bill_length_mm', ylabel='bill_depth_mm'>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-11-output-2.png)

The *scatterplot* is a special kind of *relplot*, so we can get the same output from *relplot* as well.
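When we specify both `x` and `y` arguments to `relplot`, it draws a scatter plot.

```
sns.relplot(df, x="bill_length_mm", y="bill_depth_mm")
```

```
# scatterplot of bill length vs bill depth
# with color by the species
sns.scatterplot(df, x="bill_length_mm", y="bill_depth_mm", hue="species")
```

```
<Axes: xlabel='bill_length_mm', ylabel='bill_depth_mm'>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-13-output-2.png)

```
# scatterplot of bill length vs bill depth
# with color by the species and style by sex
sns.scatterplot(df, x="bill_length_mm", y="bill_depth_mm", hue="species", style="sex")
```

```
<Axes: xlabel='bill_length_mm', ylabel='bill_depth_mm'>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-14-output-2.png)

```
# scatterplot of bill length vs bill depth
# with color by the species, style by sex and size by body weight
sns.scatterplot(df, x="bill_length_mm", y="bill_depth_mm", hue="species", style="sex", size="body_mass_g")
```

```
<Axes: xlabel='bill_length_mm', ylabel='bill_depth_mm'>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-15-output-2.png)

Seaborn allows drawing lines on the graphs using matplotlib primitives.

```
g = sns.relplot(data=df, x="bill_length_mm", y="bill_depth_mm", hue="species")

# draw a line using start and end points
g.ax.axline(xy1=(30, 13), xy2=(60, 19), color="g", dashes=(5, 2))

# draw a line using start point and slope
g.ax.axline(xy1=(35, 13), slope=.6, color="r", dashes=(5, 2))
```

```
<matplotlib.lines._AxLine at 0x7f6f9e001f30>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-16-output-2.png)

## 3.2 Distributions

```
sns.displot(df, x="flipper_length_mm", kind="hist")
# equivalent axes-level call: sns.histplot(df, x="flipper_length_mm")
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-17-output-1.png)

```
sns.displot(df, x="flipper_length_mm", kind="kde")
# equivalent axes-level call: sns.kdeplot(df, x="flipper_length_mm")
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-18-output-1.png)

The displots allow grouping by color.

```
sns.displot(df, x="flipper_length_mm", kind="kde", hue="species")
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-19-output-1.png)

We can stack multiple distributions on top of each other.

```
sns.displot(df, x="flipper_length_mm", kind="kde", hue="species", multiple="stack")
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-20-output-1.png)

We could do the same with histograms.

```
sns.displot(df, x="flipper_length_mm", kind="hist", hue="species", multiple="stack")
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-21-output-1.png)

## 3.3 Categorical plots

Catplots allow visualizing categorical data. The default view is a scatter plot with a small jitter added to make the points visible.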
```
sns.catplot(df, x="species", y="bill_length_mm")
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-22-output-1.png)

A slightly better looking version of that is a swarm plot.

```
sns.catplot(df, x="species", y="bill_length_mm", kind="swarm")
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-23-output-1.png)

We can add another dimension using hue.

```
sns.catplot(df, x="species", y="bill_length_mm", kind="swarm", hue="sex")
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-24-output-1.png)

We could even flip the axes, if we want.

```
sns.catplot(df, x="bill_length_mm", y="species", kind="swarm", hue="sex")
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-25-output-1.png)

## 3.4 Comparing Distributions

The *boxplot* and *violinplot*, both kinds of catplots, allow comparing distributions.

```
sns.boxplot(df, y='bill_length_mm')
```

```
<Axes: ylabel='bill_length_mm'>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-26-output-2.png)

```
sns.violinplot(df, y='bill_length_mm')
```

```
<Axes: ylabel='bill_length_mm'>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-27-output-2.png)

Both these plots allow splitting the distribution by a categorical column.

```
sns.violinplot(df, y='bill_length_mm', x="species")
```

```
<Axes: xlabel='species', ylabel='bill_length_mm'>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-28-output-2.png)

We could add another dimension using hue.

```
sns.violinplot(df, y='bill_length_mm', x="species", hue="sex")
```

```
<Axes: xlabel='species', ylabel='bill_length_mm'>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-29-output-2.png)

We could use the space better by splitting the violin when there are only two categories.
```
sns.violinplot(df, y='bill_length_mm', x="species", hue="sex", split=True)
```

```
<Axes: xlabel='species', ylabel='bill_length_mm'>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-30-output-2.png)

### 3.4.1 Combining multiple views on the data

The `jointplot` and `pairplot` functions plot both relationships and distributions in a single graph.

```
sns.jointplot(df, x="bill_length_mm", y="bill_depth_mm", height=3)
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-31-output-1.png)

```
sns.jointplot(df, x="bill_length_mm", y="bill_depth_mm", hue="species", height=4)
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-32-output-1.png)

The pairplot shows relations between all the numerical columns in a single grid.

```
sns.pairplot(data=df, hue="species")
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-33-output-1.png)

### 3.4.2 Showing multiple charts

Seaborn allows showing a grid of charts for displaying more information.

```
sns.relplot(df, x="bill_length_mm", y="bill_depth_mm", col="species", height=3)
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-34-output-1.png)

```
sns.relplot(df, x="bill_length_mm", y="bill_depth_mm", hue="sex", col="species", height=3)
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-35-output-1.png)

```
sns.relplot(df, x="bill_length_mm", y="bill_depth_mm", col="species", row="sex", height=3)
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-36-output-1.png)

When there are too many categories, we can even specify `col_wrap`.

```
sns.relplot(df, x="bill_length_mm", y="bill_depth_mm", hue="sex", col="species", col_wrap=2, height=3)
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-37-output-1.png)

This functionality is similar to `facet_wrap` in R (ggplot2).
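Figure-level functions such as `relplot` return a `FacetGrid`, which can be used to adjust the grid of panels after plotting. The following is a minimal sketch, assuming a small made-up DataFrame in place of the penguins data (hypothetical values, chosen only so the example runs offline); the axis label strings are likewise illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the penguins data (values are for illustration only).
toy = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Chinstrap", "Chinstrap", "Gentoo", "Gentoo"],
    "bill_length_mm": [39.1, 39.5, 46.5, 50.0, 46.1, 50.0],
    "bill_depth_mm": [18.7, 17.4, 17.9, 19.5, 13.2, 16.3],
})

# Three species with col_wrap=2: two panels in the first row, one in the second.
g = sns.relplot(data=toy, x="bill_length_mm", y="bill_depth_mm",
                col="species", col_wrap=2, height=3)

# The returned FacetGrid has helpers for panel titles and axis labels.
g.set_titles("{col_name}")
g.set_axis_labels("Bill length (mm)", "Bill depth (mm)")

# One Axes object per facet; collect the titles to confirm the layout.
n_panels = len(g.axes.flat)
titles = [ax.get_title() for ax in g.axes.flat]
print(n_panels, titles)
```

Working with the returned grid keeps per-panel tweaks in one place instead of building each subplot by hand.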
### 3.4.3 Multiple Charts in a grid

```
import matplotlib.pyplot as plt
```

```
f, axs = plt.subplots(1, 2, figsize=(8, 3))
sns.scatterplot(df, x="bill_length_mm", y="bill_depth_mm", hue="species", ax=axs[0])
sns.histplot(df, x="species", hue="species", ax=axs[1])
```

```
<Axes: xlabel='species', ylabel='Count'>
```

![](https://anandology.com/python-for-bioinformatics/seaborn_files/figure-html/cell-39-output-2.png)

Python for Bioinformatics was written by [Anand Chitipothu](https://anandology.com/). This book was built with [Quarto](https://quarto.org/).
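To make the box plots from section 3.4 more concrete, here is a minimal standard-library sketch of the numbers a conventional box plot summarizes: quartiles, whiskers, and outliers. It assumes the common Tukey rule with whiskers at 1.5 times the interquartile range, which matches seaborn's default `whis=1.5`, though the library's exact quartile computation may differ in detail. The helper name `box_stats` is hypothetical, not a seaborn function.

```python
from statistics import quantiles


def box_stats(values, whis=1.5):
    """Return the summary numbers a conventional box plot draws for one sample."""
    data = sorted(values)
    # Quartiles with linear interpolation between neighbouring data points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - whis * iqr
    high_limit = q3 + whis * iqr
    # Whiskers end at the most extreme data points still inside the limits.
    inside = [v for v in data if low_limit <= v <= high_limit]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_limit or v > high_limit],
    }


# Bill lengths (mm) of the five penguins shown by df.head() after dropna().
stats = box_stats([39.1, 39.5, 40.3, 36.7, 39.3])
print(stats)
```

For these five values the box spans 39.1 to 39.5, so 36.7 and 40.3 fall outside the whisker limits and are reported as outliers. With the full dataset the box is far wider, which is why a five-row sample is only useful for following the arithmetic.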
| Property | Value |
|---|---|
| Readable Markdown | null |
| Shard | 167 (laksa) |
| Root Hash | 4276928819562775967 |
| Unparsed URL | com,anandology!/python-for-bioinformatics/seaborn.html s443 |