Linux_Commands_Developer_Data_Engineer

The document provides a comprehensive list of essential Linux commands for developers and data engineers, categorized into file and directory management, text processing, networking, data engineering-specific tools, process management, version control with Git, system monitoring, disk usage, archive and compression, and development utilities. Each command is accompanied by a brief description and example usage. This serves as a quick reference guide for performing various tasks in a Linux environment.

Uploaded by

studyhacks88
Copyright
© All Rights Reserved

Linux Commands for Developers and Data Engineers

File and Directory Management

ls - Lists files and directories in the current directory. Example: ls -l shows details.

cd - Changes the current directory. Example: cd /home navigates to the /home directory.

pwd - Displays the current working directory.

mkdir - Creates a new directory. Example: mkdir project creates a folder named 'project'.

rm - Deletes files or directories. Example: rm file.txt deletes 'file.txt'. Use rm -r to delete a directory and everything inside it. Deleted files do not go to a trash folder.

cp - Copies files or directories. Example: cp file1.txt file2.txt copies file1.txt to file2.txt.

mv - Moves or renames files and directories. Example: mv old.txt new.txt renames old.txt to new.txt.

find - Searches for files and directories. Example: find / -name file.txt searches the whole filesystem, starting at /, for entries named 'file.txt'.
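
The commands above can be tried safely in a throwaway directory. The following is a minimal sketch, not a definitive workflow; the file names and the use of mktemp for a scratch directory are assumptions made for illustration.

```shell
set -eu

# Work inside a temporary directory so that no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"
pwd                                        # prints the temporary directory path

mkdir project                              # create a folder named 'project'
echo "hello" > project/file1.txt           # create a small file to work with
cp project/file1.txt project/file2.txt     # copy file1.txt to file2.txt
mv project/file2.txt project/new.txt       # rename file2.txt to new.txt
rm project/file1.txt                       # delete the original file

ls -l project                              # long listing; only new.txt remains

# Search from the current directory instead of / to keep the search small.
found=$(find . -name new.txt)
echo "$found"
```

Starting find at . rather than / limits the search to the scratch directory, which is usually faster and avoids permission errors from system directories.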

Text Processing (Critical for Data Engineering)

cat - Displays file contents. Example: cat file.txt shows the content of 'file.txt'.

less - Views file content page by page. Example: less file.txt.

grep - Searches for patterns in files. Example: grep 'error' log.txt finds 'error' in log.txt.

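awk - Processes and analyzes text data. Example: awk '{print $1}' file.txt prints the first whitespace-separated column.

sed - Performs text substitution and manipulation. Example: sed 's/old/new/g' file.txt prints the contents of file.txt with every 'old' replaced by 'new'. The file itself is unchanged unless the -i option is used.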

cut - Extracts specific columns from files. Example: cut -d',' -f2 file.csv extracts the second column.

sort - Sorts file contents. Example: sort file.txt sorts lines alphabetically.

uniq - Removes adjacent duplicate lines, so input is usually sorted first. Example: sort file.txt | uniq outputs each distinct line once.

wc - Counts lines, words, or characters. Example: wc -l file.txt counts lines.
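
These tools are most useful when chained with pipes. The sketch below counts error records per user in a small CSV file; the file events.csv and its contents are hypothetical sample data, not part of the original document.

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data: one "user,status" pair per line.
cat > events.csv <<'EOF'
alice,ok
bob,error
alice,error
carol,ok
bob,error
EOF

# wc -l counts lines; reading from stdin keeps the file name out of the output.
total=$(wc -l < events.csv)

# grep keeps the lines containing 'error', cut extracts the first
# comma-separated column, sort places identical names next to each other,
# uniq -c counts each run of identical lines, and awk reorders the result.
error_counts=$(grep 'error' events.csv | cut -d',' -f1 | sort | uniq -c | awk '{print $2 "=" $1}')
echo "$error_counts"

# sed writes the substituted text to standard output; events.csv is unchanged.
first_line=$(sed 's/ok/success/g' events.csv | head -n 1)
echo "$first_line"
```

The sort step matters: without it, the two 'bob' lines would not be adjacent and uniq -c would count them separately.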

Networking

ping - Tests connectivity to a host. Example: ping google.com.

curl - Fetches data from URLs. Example: curl http://example.com downloads the page content.

wget - Downloads files from the internet. Example: wget http://example.com/file.zip.

scp - Securely copies files between servers. Example: scp file.txt user@host:/path transfers file.txt.

netstat - Displays network connections, routing tables, etc.

ss - Shows socket statistics. Example: ss -tuln displays listening TCP and UDP sockets with numeric addresses and ports.

ftp - Transfers files using the FTP protocol. Example: ftp hostname.

Data Engineering-Specific Tools

hdfs dfs - Manages Hadoop Distributed File System (HDFS). Example: hdfs dfs -ls / lists HDFS contents.

spark-submit - Submits Spark jobs. Example: spark-submit app.py runs a PySpark application.

sqoop - Transfers data between Hadoop and relational databases.

kafka-console-producer - Publishes messages to a Kafka topic.

kafka-console-consumer - Reads messages from a Kafka topic.

flume-ng - Starts Flume agents that ingest data streams.

Process Management

ps - Displays currently running processes. Example: ps aux shows all processes with details.

top - Displays real-time system resource usage and running processes.

htop - An interactive process viewer (similar to top).

kill - Terminates a process by its PID. Example: kill 1234 sends the default SIGTERM signal to the process with PID 1234.

bg - Resumes a suspended job in the background.

fg - Resumes a job in the foreground.
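
The following sketch shows the start, inspect, and terminate cycle in a script. It assumes a placeholder sleep command stands in for a real long-running job; bg and fg are left out because they rely on interactive job control.

```shell
set -eu

# Start a long-running placeholder process in the background.
# The shell stores the PID of the most recent background job in $!.
sleep 60 &
pid=$!

# ps -p limits the listing to a single PID.
if ps -p "$pid" > /dev/null; then
    echo "process $pid is running"
fi

# kill sends SIGTERM by default, asking the process to exit.
kill "$pid"

# wait collects the exit status of the background process.
# A process ended by SIGTERM (signal 15) reports 128 + 15 = 143.
status=0
wait "$pid" || status=$?
echo "exit status: $status"
```

If a process ignores SIGTERM, kill -9 sends SIGKILL, which cannot be caught; it is usually kept as a last resort because the process gets no chance to clean up.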

Version Control (Git)

git init - Initializes a new Git repository.

git clone - Clones an existing repository. Example: git clone <repo_url>.

git add - Stages changes for commit. Example: git add file.txt.

git commit - Commits staged changes. Example: git commit -m 'message'.

git push - Pushes changes to a remote repository. Example: git push origin main.

git pull - Fetches and merges changes from a remote repository.
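
A local commit cycle can be sketched as follows. The repository lives in a temporary directory, and the user name, email address, and commit message are placeholder values chosen for this example. Pushing is omitted because it needs a remote repository.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"

git init -q                                # create an empty repository here

# Placeholder identity, stored in this repository's configuration only.
git config user.name "Example User"
git config user.email "user@example.com"

echo "first draft" > file.txt
git add file.txt                           # stage the new file
git commit -q -m 'Add file.txt'            # record the staged change

git log --oneline                          # one line per commit
commit_count=$(git rev-list --count HEAD)  # number of commits reachable from HEAD
echo "commits: $commit_count"
```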

System Monitoring and Disk Usage


df - Displays disk space usage. Example: df -h shows human-readable disk usage.

du - Shows directory size. Example: du -sh /home gives the size of /home.

free - Displays memory usage. Example: free -h shows human-readable memory usage.

uptime - Shows how long the system has been running.
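
The sketch below applies df and du to a scratch directory. The 1 MiB sample file is an assumption made so that there is something to measure; the numbers printed will differ from system to system.

```shell
set -eu
workdir=$(mktemp -d)

# Write 1 MiB of random bytes so the directory has measurable content.
head -c 1048576 /dev/urandom > "$workdir/sample.bin"

du -sh "$workdir"     # human-readable total size of the directory
df -h "$workdir"      # human-readable usage of the filesystem that holds it

# In scripts, -k reports plain kilobyte counts, which are easier to compare.
size_kb=$(du -sk "$workdir" | cut -f1)
avail_kb=$(df -Pk "$workdir" | awk 'NR==2 {print $4}')
echo "directory uses ${size_kb} KB; filesystem has ${avail_kb} KB available"
```

The -h output is convenient for reading, while the -k form avoids parsing unit suffixes such as M or G.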

Archive and Compression

tar - Archives files. Example: tar -cvf archive.tar file.txt creates an archive.

gzip - Compresses files. Example: gzip file.txt compresses 'file.txt' and replaces it with 'file.txt.gz'.

gunzip - Decompresses files. Example: gunzip file.txt.gz decompresses 'file.txt.gz' and restores 'file.txt'.
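
A round trip through tar, gzip, and gunzip can be sketched as follows. The file names and the restored directory are assumptions for illustration.

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

echo "sample data" > file.txt

# c = create, v = list files as they are added, f = name of the archive.
tar -cvf archive.tar file.txt

gzip archive.tar          # replaces archive.tar with archive.tar.gz
gunzip archive.tar.gz     # restores archive.tar and removes archive.tar.gz

# x = extract; -C chooses the directory to extract into.
mkdir restored
tar -xf archive.tar -C restored

restored_text=$(cat restored/file.txt)
echo "$restored_text"
```

Extracting into a separate directory makes it easy to compare the restored file with the original without overwriting it.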

Development Utilities

vim - Edits text files in the terminal. Example: vim file.txt opens 'file.txt' for editing.

nano - A simple text editor. Example: nano file.txt opens 'file.txt' for editing.

ssh - Connects to remote servers securely. Example: ssh user@host.

screen - Allows detached terminal sessions. Example: screen starts a session.
