Social Network Analysis (SNA), the study of the relational structure between actors, is used throughout the social and natural sciences to draw insight from connected entities. In this tutorial, you will learn how to use the NetworkX library to analyze network data in Python.
Methods will be illustrated using a dataset of the romantic relationships between characters on "Grey's Anatomy", an American medical drama on the ABC television network. Analysis and intuition will be emphasized over theory and mathematical rigor. We will code through the examples together in an IPython/Jupyter notebook.
OUTLINE
- What is Social Network Analysis? A short history, motivating examples, and terminology
  - Nodes
  - Edges
  - Adjacency
  - Attributes and Weights
- Creating Graphs
  - Create graph object
  - Add nodes and edges
  - Add attributes
  - Undirected vs. Directed graphs
- Visualizing Graphs
  - Draw graph with Matplotlib
  - Draw graph with Graphviz
  - Other visualization options -- Python and beyond (Gephi, Cytoscape, D3, etc.)
- Centrality - "Who's the Boss?"
  - Definition of Centrality
  - Examples (Medici family, trade example, etc.)
  - Compare and contrast popular centrality measures on dataset
    - Degree
    - Closeness
    - Betweenness
    - Eigenvector
- Community - "Where do I belong?"
  - Definition of Community and Community Detection
  - Examples
  - Perform community detection on dataset
    - k-cliques or Girvan-Newman
- Link Prediction - "People You May Know"
  - Definition of Link Prediction
  - Examples
  - Perform link prediction on dataset
    - Jaccard coefficient
    - Preferential Attachment
- Exporting graphs
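To give a feel for the workflow in the outline above, here is a minimal sketch of the main steps in NetworkX. It is not the tutorial's notebook: the edge list is a hypothetical toy graph with placeholder node names (`a` through `e`), not the Grey's Anatomy dataset, and the variable names are chosen only for illustration.

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

# Hypothetical toy data: placeholder names, NOT the tutorial's dataset.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]

# Creating graphs: an undirected graph; nodes are added implicitly with edges.
G = nx.Graph()
G.add_edges_from(edges)

# Centrality: four popular measures, each returned as a {node: score} dict.
degree = nx.degree_centrality(G)
closeness = nx.closeness_centrality(G)
betweenness = nx.betweenness_centrality(G)
eigenvector = nx.eigenvector_centrality(G, max_iter=1000)

# Community detection: the first split produced by Girvan-Newman.
first_split = next(girvan_newman(G))
communities = sorted(sorted(group) for group in first_split)

# Link prediction: score every pair of nodes that is not yet connected.
candidates = list(nx.non_edges(G))
jaccard = sorted(nx.jaccard_coefficient(G, candidates),
                 key=lambda triple: triple[2], reverse=True)
pref_attach = sorted(nx.preferential_attachment(G, candidates),
                     key=lambda triple: triple[2], reverse=True)

print("Most central node by degree:", max(degree, key=degree.get))
print("Communities after first split:", communities)
print("Top Jaccard candidate (u, v, score):", jaccard[0])
print("Top preferential-attachment candidate (u, v, score):", pref_attach[0])
```

Each centrality call returns a plain dictionary, so comparing measures on a real dataset amounts to sorting or tabulating those dictionaries side by side.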
BACKGROUND AND PRE-REQS
If you installed Python through Anaconda, you should be set. In essence, we need Python 3 and the conda package manager:
- Anaconda
- Python 3
- Conda package manager
- Packages listed below
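A quick way to check the setup before the session is sketched below. It assumes only the two packages named in this abstract, NetworkX and Matplotlib; the `required` list is illustrative and should be extended to match the full package list.

```python
import sys
from importlib.util import find_spec

# Assumption: only packages named in the abstract are checked here;
# extend this list to match the tutorial's full package list.
required = ["networkx", "matplotlib"]

# Confirm the interpreter is Python 3 or later.
python_ok = sys.version_info.major >= 3

# find_spec returns None when a top-level package cannot be located.
missing = [name for name in required if find_spec(name) is None]

print("Python 3:", "yes" if python_ok else "no")
print("Missing packages:", ", ".join(missing) if missing else "none")
```

Anything reported as missing can be installed with pip or conda before the tutorial starts.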
This is an intermediate-level tutorial, so we expect prior knowledge of:
- Python types & data structures
- Installing Python packages through pip or conda
- Data Science Packages: