PSOSM Tutorial Notes
Tutorial 1
Introduction
This tutorial covers how to install Ubuntu Linux in VirtualBox on Windows, Mac, or Linux.
Ubuntu is a popular Linux distribution that is easy to use for beginners. VirtualBox allows you to
install Ubuntu in a virtual machine so you don't have to disturb your main operating system.
First, download the Ubuntu ISO image from the Ubuntu website. The recommended version is the latest Long Term Support (LTS) release; LTS releases are supported for five years.
Also download and install VirtualBox for your operating system. VirtualBox allows you to create
and run virtual machines.
Open VirtualBox and click "New" to create a new virtual machine. Give it a name like "Ubuntu
Test", select Linux and Ubuntu 64-bit.
Allocate enough RAM, ideally 2-3 GB if you have it. Create a new virtual hard disk of at least 10
GB.
Installing Ubuntu
In VirtualBox, select the Ubuntu virtual machine and click Start. When prompted, select the
Ubuntu ISO file you downloaded. This will boot the Ubuntu installer.
Follow the prompts to install Ubuntu. Choose "Normal Install" unless you have bandwidth
constraints. Partition the virtual hard drive carefully. Create a swap partition the same size as
the RAM and an "ext4" root partition with the remaining space.
The install takes 30-40 minutes. You will set a username and password, which you will need to log in after installation.
Finishing Up
Once installed, Ubuntu will reboot. Log in with the username and password you set.
You now have Ubuntu running in a virtual machine inside your existing OS! You can use the
terminal and install additional software like Python for programming tasks.
Git and GitHub
Without Git, you would need to make a full copy of your codebase each time you wanted a snapshot, which takes up a massive amount of storage space. Git stores only the diffs (changes) between commits, saving storage space.
GitHub provides remote hosting for Git repositories. This allows you to share your code with
others and collaborate. Here's how to connect your local repo to a GitHub remote:
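A typical sequence might look like the following; the repository URL and branch name are placeholders for your own:

    git remote add origin https://github.com/<username>/<repo>.git
    git push -u origin main    # use "master" if that is your repo's default branch

After this, later commits can be pushed with a plain git push.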
GitHub provides a great web UI for viewing commit history, diffs, collaborating with others, and
more.
Summary
• Git is a powerful version control system that allows tracking code history, branching,
collaboration and more
• Basic commands: init, add, commit, status, log (see the example after this list)
• GitHub provides hosting for remote repositories to enable collaboration
• Following a basic Git workflow of adding/committing changes helps manage code
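A minimal sketch of that workflow on a new project (the file set and commit message are illustrative):

    git init                        # create a new repository in the current folder
    git add .                       # stage all changed files
    git commit -m "First commit"    # record a snapshot of the staged changes
    git status                      # show what has changed since the last commit
    git log                         # list the commit history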
Tutorial 2
Week 2 Reddit Tutorial: Data Collection Overview
Introduction to Reddit
Reddit is a social media platform where users interact through comments and upvotes on
posts. It has a community feel with subgroups called subreddits focused on specific topics. For
example, r/olympics is about the Olympics. Users can search for keywords like "India" to find
relevant subreddits, posts, communities and users.
Posts have titles, text bodies, images or videos. They accrue upvotes and comments.
Comments allow layered conversations as users can reply to other comments.
Subreddits have rules, moderators, flairs (like hashtags), and related communities.
To collect Reddit data, first create a Reddit account and app. The app will provide
authentication credentials like a client ID and secret.
Use the Python library PRAW to connect to Reddit with the credentials. The subreddit method
specifies a particular subreddit to pull data from. For example, subreddit("india") collects
posts from r/india.
PRAW returns post data like title, score, ID, URL, number of comments, created time, and body
text. Store this in a Pandas dataframe for analysis.
Save collected data as CSV files to access later. The PRAW documentation explains available
data fields like comments, usernames, post metadata etc.
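A minimal sketch of this collection flow, assuming placeholder credentials and the r/india example from above (the limit, user agent, and output file name are arbitrary choices):

    import praw
    import pandas as pd

    # Credentials come from the Reddit app created earlier (placeholders here)
    reddit = praw.Reddit(client_id="YOUR_CLIENT_ID",
                         client_secret="YOUR_CLIENT_SECRET",
                         user_agent="psosm-tutorial-script")

    rows = []
    for post in reddit.subreddit("india").hot(limit=100):
        rows.append({"title": post.title,
                     "score": post.score,
                     "id": post.id,
                     "url": post.url,
                     "num_comments": post.num_comments,
                     "created": post.created_utc,
                     "body": post.selftext})

    # Store in a dataframe and save as CSV for later analysis
    df = pd.DataFrame(rows)
    df.to_csv("india_posts.csv", index=False)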
Key Takeaways
Tutorial 3
Introduction
This tutorial covers how to collect Twitter data from the Twitter API and store it in a MySQL
database. MySQL is an open-source relational database management system that uses SQL.
Installing MySQL
The first step is installing MySQL on your system using the sudo apt-get install mysql-
server command. You will be prompted to set a root password during installation.
To connect Python to MySQL, the MySQLdb module needs to be installed using sudo apt-get install python-mysqldb.
This completes the installation and integration of MySQL with the Python environment.
A database called osn_data is created to store the tweets. Within this database, a table
called tweets is created with tweet_id as the primary key and a text field to store the tweet
content.
The Python script that collects real-time tweets from the Twitter API is modified to insert the
tweets into the MySQL database.
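A minimal sketch of that insertion step, assuming the MySQLdb module and placeholder credentials; in the real script the id and text would come from each incoming tweet in the stream:

    import MySQLdb

    # Connect to the osn_data database (credentials are placeholders)
    db = MySQLdb.connect(host="localhost", user="root",
                         passwd="YOUR_PASSWORD", db="osn_data")
    cursor = db.cursor()

    # Create the tweets table if it does not exist yet
    cursor.execute("CREATE TABLE IF NOT EXISTS tweets "
                   "(tweet_id BIGINT PRIMARY KEY, text TEXT)")

    # Example values; the streaming callback would supply these per tweet
    tweet_id = 1234567890
    tweet_text = "example tweet text"
    cursor.execute("INSERT INTO tweets (tweet_id, text) VALUES (%s, %s)",
                   (tweet_id, tweet_text))
    db.commit()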
After running the modified script for some time, the MySQL table contains multiple rows of
tweets.
The SELECT * FROM tweets query is used to retrieve the stored tweets. Other queries
like SELECT COUNT(*) FROM tweets can get the number of rows in the table.
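Run from Python, the same queries might look like this, reusing the cursor from the sketch above:

    cursor.execute("SELECT COUNT(*) FROM tweets")
    print(cursor.fetchone()[0])          # number of stored tweets

    cursor.execute("SELECT * FROM tweets")
    for tweet_id, text in cursor.fetchall():
        print(tweet_id, text)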
This allows effective storage and querying of streaming Twitter data in a MySQL database. The queries can be integrated into the Python script for better data collection and analysis.
Tutorial 4
Introduction
Social media platforms like Facebook and Twitter generate large amounts of data that can be
analyzed as network graphs. Network graphs represent entities like users, pages, or groups as
nodes, and relationships between them as edges. In this post, we will learn how to collect and
visualize Twitter data as a network graph using tools like Twecoll and Gephi.
Representing Twitter Data as a Network Graph
There are different ways to represent a node-edge graph, including adjacency matrices,
GraphML format, and CSV files.
An adjacency matrix is a 2D square matrix with dimensions equal to the number of nodes in
the graph. A 1 in cell (i,j) indicates an edge exists between node i and node j. This is simple to
construct using arrays in a programming language, but can take up a lot of space for sparse
graphs.
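A small sketch of building an adjacency matrix from an edge list, using a made-up undirected graph of four nodes:

    # Hypothetical edge list for nodes numbered 0..3
    edges = [(0, 1), (0, 2), (1, 3)]
    n = 4

    # n x n matrix of zeros; a 1 in cell (i, j) marks an edge between i and j
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = 1
        adj[j][i] = 1    # mirror the entry since the graph is undirected

    for row in adj:
        print(row)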
GraphML is an XML-based format containing node and edge elements in sequence. Each node
element must have a unique ID attribute and each edge element has source and target
attributes identifying the endpoint nodes of an edge.
To collect Twitter data, we can use the Twecoll command line tool. It gathers information about
your Twitter followers and "friends of friends." Running twecoll from the command line prompts you to authorize a Twitter application and enter your credentials; it then retrieves follower IDs. The twecoll edgelist command adds edges between you, your followers, and their followers.
Gephi is an open-source tool for graph visualization. After installing it, we can open the
GraphML file produced by Twecoll. Gephi displays summary statistics about the graph and can
generate various network metrics like degree distribution, shortest paths, and modularity to
find communities.
The graph layout can be customized by changing color, size, and shape of nodes and edges.
We set the node color based on modularity class to see community structure and size by
degree to highlight key nodes. The edge curvature can also be adjusted to improve readability.
The final graph can be exported as an image or PDF file.
Key Terms
• Degree - The number of edges incident to a node. Related measures are in-degree and out-degree.
• Data Laboratory - The Gephi view that shows nodes, edges, and their attributes as originally stored in the GraphML file.
Conclusion
This summarizes the key steps covered in the video to collect Twitter data, represent it as a
network graph, analyze graph metrics using Gephi, and customize the graph visualization.
Tutorial 5
Overview
This video explains how to use the Natural Language Toolkit (NLTK) library in Python to
analyze text data collected from Twitter. The goal is to clean and process the text to get
insights into what the tweets are about.
The video starts by reading in 152 tweets collected about the Serum Institute. To analyze the
tweets, the first step is to break them into individual words using NLTK's word tokenizer. This
converts each tweet into a list of word tokens.
All the tokens from all the tweets are added to one large list. This list is passed to the Counter
class from the Python collections module to get a count of each unique word.
The most common words include some useful terms like "Serum" and "India" but also lots of
noise like punctuation. To clean this up:
• All text is lowercased so "Serum" and "serum" map to the same word
• Punctuation is removed using Python's string translate() method
• Stopwords like "the", "of", "at" are removed using NLTK's stopwords corpus
• Very short tokens (two characters or fewer) are removed to delete residue like "rt"
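A compact sketch of this pipeline, assuming the tweets are already loaded as a list of strings; the two placeholder tweets and the exact length threshold are illustrative, and the steps in the video may differ slightly:

    import string
    from collections import Counter

    import nltk
    from nltk.corpus import stopwords
    from nltk.tokenize import word_tokenize

    nltk.download("punkt")        # tokenizer model
    nltk.download("stopwords")    # stopword lists

    # Placeholder tweets; in the tutorial these are the 152 collected tweets
    tweets = ["RT Fire at the Serum Institute facility in Pune...",
              "Serum Institute says vaccine production is safe..."]

    stop_words = set(stopwords.words("english"))
    table = str.maketrans("", "", string.punctuation)    # strips punctuation

    tokens = []
    for tweet in tweets:
        for token in word_tokenize(tweet.lower()):          # lowercase, then tokenize
            token = token.translate(table)                  # remove punctuation
            if len(token) > 2 and token not in stop_words:  # drop stopwords and short leftovers like "rt"
                tokens.append(token)

    print(Counter(tokens).most_common(10))                  # most frequent cleaned words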
After this cleanup, the most common words provide a much clearer signal on what the tweets
are about. Words like "fire", "vaccine", "Pune", "lives lost" indicate a fire at the Serum Institute
facility in Pune that resulted in loss of life.
Key Takeaways
• NLTK provides useful text processing capabilities like tokenization and stopword lists
• Cleaning the text by lowercasing, removing punctuation and stopwords, etc. greatly improves analysis
• Even simple cleaning can reveal insights and topics within noisy text data like tweets
Tutorial 6
Introduction to Gephi
Gephi is an open-source network analysis and visualization software package used for research projects in education, journalism, digital humanities, and other fields. It can import social network data from platforms such as Facebook and Twitter and generate graphs and clusters.
Loading Data into Gephi
• Gephi can read many file formats, such as GML, GraphML, Pajek NET, and UCINET DL files.
• To import from CSV, two files are needed: one with the node list and one with the edge list (see the example after this list).
• The node CSV should have a column with a unique node id.
• The edge list CSV should have source and target columns with node ids for each edge. It can also include a column indicating the edge type: directed or undirected.
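As an illustration, a minimal pair of import files might look like this (the column names follow Gephi's CSV conventions; the values are made up):

    nodes.csv
    Id,Label
    1,Alice
    2,Bob
    3,Carol

    edges.csv
    Source,Target,Type
    1,2,Undirected
    2,3,Undirected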
Overview Tab
• Appearance tab: Change node/edge color and size based on categorical and continuous attributes.
• Layout tab: Choose from different layout algorithms like Fruchterman-Reingold and customize them.
• Filters tab: Filter nodes and edges based on attributes like degree, followers etc.
• Statistics tab: Compute network metrics like average degree, diameter, modularity etc.
Preview Tab
The Preview tab shows the generated graph with the final visualization options applied. New layouts and filters can be tested without affecting the existing workspace.
Data Laboratory
The Data Laboratory shows node and edge data as tables. Columns can be edited directly.
Specific nodes/edges can be selected for analysis.
Customizing Visualization
• Change node color, size based on attributes like followers, degree etc.
• Modify edge color, thickness based on weight.
• Adjust labels, fonts, background color.
• Apply different layout algorithms like Fruchterman-Reingold.
• Filter nodes by criteria like followers between a range.
• Create subsets as new workspaces using filters.
Key Features
Tutorial 7
The video introduces how to create interactive data visualizations using Python scripts and Highcharts, a JavaScript charting library. It demonstrates creating four types of charts from sample data, as sketched after the list below:
Bar Chart
Line Chart
Scatter Plot
Bubble Chart
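A minimal sketch of the Python-plus-Highcharts approach: the script below writes a small HTML page that loads the Highcharts library and renders a bar chart from sample data (the data values, chart options, and output file name are placeholders):

    import json

    # Placeholder sample data and chart options
    options = {
        "chart": {"type": "bar"},
        "title": {"text": "Sample bar chart"},
        "xAxis": {"categories": ["A", "B", "C"]},
        "series": [{"name": "Count", "data": [5, 3, 8]}],
    }

    # Embed the options as JSON inside a minimal HTML page
    html = """<html>
    <head><script src="https://code.highcharts.com/highcharts.js"></script></head>
    <body>
    <div id="container"></div>
    <script>Highcharts.chart('container', %s);</script>
    </body>
    </html>""" % json.dumps(options)

    with open("bar_chart.html", "w") as f:
        f.write(html)    # open bar_chart.html in a browser to view the interactive chart

Changing "type" to "line" or "scatter" produces the corresponding chart; bubble charts additionally need the highcharts-more.js module.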
Highcharts Cloud is a platform that visualizes data instantly when you paste it in and auto-generates charts. Benefits:
• Don't need to write any code
• Customizable charts
• Interactive features like filtering, tooltips
• Can save and download charts
Key Takeaways