Insights from E. coli and Linux: Comparing Bio and Computer OS Networks, Slides of Biology

These slides introduce biological network analysis using two examples: the E. coli transcriptional regulatory network and the Linux call graph. The author discusses the topology and evolution of these networks, their hierarchical organization, and the correlation between reuse and persistence. The slides also highlight how biological networks and computer operating system networks differ in modularity, node reuse, and robustness.

Typology: Slides

2021/2022

Uploaded on 07/05/2022

allan.dev
Network analysis of biological data
Gang Fang, PhD
New York University Shanghai Campus

How much data is there in biology?

How big is the cell?

How thick is the cell wall?

How long is its flagellum?

How long does the cell live?

How much “nutrition” does the cell need?

How long is the genome?

How many genes are there in the genome?

How many mRNAs are inside the cell?

How many proteins are inside the cell?

How many metabolites are inside the cell?

How long does one mRNA “stay” in the cell?

How much “energy” does it take to synthesize a protein?

Are the data constant over time and under all circumstances?

What's the best way to describe biology?

What is a biological network?

  • Node

Gene, protein, metabolite, any “biological object”

  • Edge

Regulation, protein-protein interaction, any kind of “similarity” or “dissimilarity” etc.

  • “Weights” or features

Gene conservation, expression value, half-life, or any other measurable or categorical variable.

  • Network topology

Clusters, modularity, node centrality, shortest path etc.

  • Network dynamics

Comparison of networks: time series and environmental changes

  • Network rewiring and permutation

Test your theory!
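The pieces listed above (nodes, edges, topology, rewiring) can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the slides: the node names (`tfA`, `g1`, ...) and the edge list are hypothetical, and the rewiring shown is a generic degree-preserving edge swap, one common way to build a randomized null network against which a topological claim can be tested.

```python
import random
from collections import deque

# Hypothetical toy network: an edge (u, v) means "u regulates v".
EDGES = [("tfA", "tfB"), ("tfA", "tfC"), ("tfB", "g1"),
         ("tfB", "g2"), ("tfC", "g2"), ("tfC", "g3")]


def adjacency(edges):
    """Build a node -> set-of-targets map from a directed edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())
    return adj


def shortest_path_length(adj, source, target):
    """Breadth-first search; returns the number of edges, or None if unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def rewire(edges, n_swaps, seed=0):
    """Degree-preserving randomization: repeatedly swap the targets of two edges."""
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        # Skip swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges


adj = adjacency(EDGES)
print(shortest_path_length(adj, "tfA", "g1"))
print(rewire(EDGES, n_swaps=100, seed=1))
```

Because every swap keeps each node's in-degree and out-degree fixed, any statistic that differs between the real and the rewired networks cannot be explained by the degree sequence alone.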

Hierarchical organization: pyramidal versus top-heavy

E. coli transcriptional regulatory network (pyramidal) versus the Linux call graph (top-heavy)

Hierarchy levels: master regulator, middle manager, workhorse

Persistent genes (E. coli): genes subject to strong natural selective pressure

Persistent functions (Linux): software engineers’ favorite functions
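One simple way to make the three levels concrete is to classify nodes by whether they have incoming and outgoing edges. The sketch below rests on that assumption: a master regulator has no regulator of its own, a workhorse regulates nothing, and everything else is a middle manager. The slides do not say how the levels were actually assigned, and both edge lists are hypothetical toy data.

```python
from collections import Counter

# Hypothetical toy networks: an edge (u, v) means "u regulates v" or "u calls v".
PYRAMID_EDGES = [("m", "a"), ("m", "b"),
                 ("a", "w1"), ("a", "w2"), ("b", "w3"), ("b", "w4")]
TOP_HEAVY_EDGES = [("f1", "h"), ("f2", "h"), ("f3", "h"), ("f4", "h"),
                   ("h", "util")]


def classify_levels(edges):
    """Assign each node to one of three levels from its in/out edges."""
    sources = {u for u, _ in edges}
    targets = {v for _, v in edges}
    levels = {}
    for node in sources | targets:
        if node not in targets:
            levels[node] = "master regulator"   # nothing points at it
        elif node not in sources:
            levels[node] = "workhorse"          # it points at nothing
        else:
            levels[node] = "middle manager"     # both incoming and outgoing
    return levels


def shape(levels):
    """Compare the size of the top level with the size of the bottom level."""
    counts = Counter(levels.values())
    top, bottom = counts["master regulator"], counts["workhorse"]
    if top < bottom:
        return "pyramidal"
    if top > bottom:
        return "top-heavy"
    return "balanced"


print(shape(classify_levels(PYRAMID_EDGES)))
print(shape(classify_levels(TOP_HEAVY_EDGES)))
```

Comparing only the top and bottom counts is a deliberate simplification; a fuller analysis would look at the size of every level.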

Organization of modules: independent versus overlapping


                      E. coli TRN    Linux call graph
Average overlap       4.3%           80.7%
Maximum node reuse    15.6%          87.5%
Average node reuse    (not given)    (not given)

Modules are labeled by their master regulators: TFs in the TRN, high-level starting functions in the call graph

TRN: modules overlap little, components are less generic

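$$\mathrm{Overlap}(M_2, M_3) = \frac{|M_2 \cap M_3|}{|M_2 \cup M_3|} = \frac{2}{11}$$

[Figure: example modules, including M2 and M3, with per-node reuse values reuse=2/3 and reuse=1/]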

Call graph: modules overlap heavily, functions are highly reused (generic), e.g. “printk”
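The overlap and reuse measures can be sketched as follows. Several things here are assumptions rather than statements from the slides: a module is taken to be everything reachable from a master regulator (the regulator included), node reuse is taken to be the fraction of modules that contain the node, and the toy graph with roots `r1`, `r2`, `r3` is hypothetical, chosen only so that the numbers match the 2/11 and 2/3 shown on the slide.

```python
from fractions import Fraction

# Hypothetical toy graph: r1, r2, r3 are master regulators; s1 and s2 are shared.
EDGES = [("r1", "x"), ("r1", "y"),
         ("r2", "a"), ("r2", "b"), ("r2", "c"), ("r2", "d"),
         ("r2", "s1"), ("r2", "s2"),
         ("r3", "e"), ("r3", "f"), ("r3", "g"),
         ("r3", "s1"), ("r3", "s2")]


def build_modules(edges):
    """Map each root (node with no incoming edge) to the set of nodes it reaches."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())
    targets = {v for _, v in edges}
    modules = {}
    for root in sorted(n for n in adj if n not in targets):
        seen, stack = {root}, [root]
        while stack:
            for nxt in adj[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        modules[root] = seen
    return modules


def overlap(m1, m2):
    """Shared nodes divided by all nodes of the two modules (Jaccard index)."""
    return Fraction(len(m1 & m2), len(m1 | m2))


def node_reuse(node, modules):
    """Fraction of modules that contain the node."""
    return Fraction(sum(node in m for m in modules.values()), len(modules))


modules = build_modules(EDGES)
print(overlap(modules["r2"], modules["r3"]))   # 2/11
print(node_reuse("s1", modules))               # 2/3
print(node_reuse("x", modules))                # 1/3
```

In a network where most modules share generic low-level nodes, as the slide reports for “printk” in the call graph, both measures move toward 1.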

We observe opposite correlation behaviors in the two systems: Reuse and persistence are negatively correlated in the E. coli regulatory network but positively correlated in the Linux call graph.

[Spearman correlation $r = -0.074$ ($P < 0.01$) and $r = 0.10$ ($P < 10^{-4}$), respectively]
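For readers unfamiliar with the statistic, a Spearman correlation is a Pearson correlation computed on ranks. The sketch below implements it with the standard library only; the `reuse` and `persistence` values are hypothetical placeholders, not data from the slides, and no P value is computed.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-node values, for illustration only.
reuse = [0.10, 0.40, 0.20, 0.80, 0.60]
persistence = [0.30, 0.50, 0.20, 0.90, 0.40]
print(spearman(reuse, persistence))
```

Working on ranks makes the measure insensitive to the scale of reuse and persistence; only their ordering across nodes matters.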

Can we use network analysis to identify protein “living fossils”?

Thank you!