Open Source Demystified Level 1 Bachelor of Computer Applications (Bca)
Project Report submitted in partial fulfilment of the requirements for the award of the degree of
BACHELOR OF COMPUTER APPLICATIONS(BCA)
CERTIFICATE OF COMPLETION
This is to certify that the practical lab for the course titled “Open
Source Demystified Level 1” has been satisfactorily completed by
Davis Arnold M, 21BCAF20 in partial fulfilment of the award of the
Bachelor of Computer Applications degree requirements prescribed
by Kristu Jayanti College (Autonomous) Bengaluru (Affiliated to
Bangalore University) during the academic year 2022-2023.
External Mentor
Valued by Examiners
1:_____________________
2: ____________________
I, Davis Arnold M, 21BCAF20 hereby declare that the practical lab work
for the course titled “Open Source Demystified Level 1” has been
completed by me, as per the course guidelines, under the guidance of
Ayshwarya B.
This report work has not been submitted earlier either to any University /
Institution or any other body for the fulfilment of the requirement of a
course of study.
Signature
Davis Arnold M
21BCAF20
Location:
Date:
TABLE OF CONTENTS
SYNOPSIS
Glossary
Introduction
  About this document
  Purpose
  Audience
Open Source Introduction
Open Source Project Examples
1) Cortx
  Introduction
  Project Summary
  Project Details
  Project References
2) Foniod
  Introduction
  Project Summary
  Project Details
  Project References
3) Apache Beam
  Introduction
  Project Summary
  Project Details
  Project References
How to contribute to Open Source?
  Ways to Contribute
  Methods to join the community and start contributing
  Contribution Flowchart
Community Engagement Experience
My Contributions
Open Source Value
References
List of Figures
Figure 1: Sample picture
Figure 2: Sample picture
List of Tables
Table 1: Sample table
SYNOPSIS
The Open Source Demystified Level 1 course builds an understanding of open source
technology and its ecosystem. It gives a fundamental overview of open source,
including definitions, the ecosystem, the community, how to participate, its key
potential, and open source culture. This knowledge makes it easier to recognize open
source opportunities, enter the workforce, contribute, learn, and advance one's career.
The course also offers practical exposure to the ecosystem of a particular open source
project. This project report details the lessons learned and the tasks completed for
the course. Its summary and recommendations are intended to help readers understand
how to keep contributing to open source projects.
Glossary
1. Soda Foundation - Under the auspices of the Linux Foundation, the Soda
Foundation is an open source initiative with the goal of promoting an
ecosystem of free and open source data management and storage applications.
3. Committing the Changes - To save your changes to the local repository, stage
the changed files and then use the "commit" command.
Command – git add <file-path>
git add <file-path> <file2-path>
git commit -m 'your commit message'
4. Pulling the Changes - The git pull command fetches and downloads content
from a remote repository and immediately updates the local repository to match
that content.
Command – git pull
5. Pushing the Changes - The git push command uploads local repository
content to a remote repository.
Command – git push
6. Free Software - "Free software" is software that respects users' freedom and
the community: users have the freedom to run, copy, distribute, study, alter,
and improve the software.
11. Git - Git is a free and open-source version control system for managing
source code, widely used in DevOps workflows. It can efficiently manage
projects ranging from small to very large.
12. GitHub - GitHub is a platform for collaboration and version control. It
enables you to collaborate remotely on projects with other people.
13. Slack - Slack is a business messaging app that links users to the data they
require. Slack changes how businesses communicate by bringing people
together to work as a single, cohesive team.
14. Git status - The git status command shows the state of the working directory
and the staging area.
Command – git status
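The split between the working directory and the staging area that git status reports can be made concrete with a small sketch. It assumes the short machine-readable output of `git status --porcelain` (version 1), where each line starts with two status characters: the first describes the staging area and the second the working directory. The function name and the sample paths are hypothetical.

```python
def summarize_status(porcelain_output: str) -> dict:
    """Group paths from `git status --porcelain` output by where the change lives."""
    summary = {"staged": [], "unstaged": [], "untracked": []}
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        index_state, worktree_state, path = line[0], line[1], line[3:]
        if index_state == "!":
            continue  # ignored files, only listed when --ignored is passed
        if index_state == "?" and worktree_state == "?":
            summary["untracked"].append(path)
            continue
        if index_state != " ":
            summary["staged"].append(path)  # change is recorded in the staging area
        if worktree_state != " ":
            summary["unstaged"].append(path)  # change exists only in the working directory
    return summary


# Hypothetical sample: one staged new file, one modified but unstaged file,
# and one untracked file.
sample = "A  docs/intro.md\n M README.md\n?? notes.txt\n"
print(summarize_status(sample))
```

A file can appear in both the staged and unstaged lists when it was edited again after being added, which is why the two columns are read independently.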
SODA Foundation is an open source project under the Linux Foundation that aims to
foster an ecosystem of open source data management and storage software for data
autonomy. SODA Foundation offers a neutral forum for cross-project collaboration
and integration and provides end users with quality end-to-end solutions.
SODA stands for SODA Open Data Autonomy. It is an open source unified autonomous data
framework for data mobility from edge to core to cloud.
SODA Foundation focuses on building unified frameworks, APIs, and solutions in the
areas of
• Data Mobility
• Data Protection
• Data Lifecycle
• Unified Storage Platform
• Cloud Native Storage
• Data Governance
• Data Orchestration
• Data Energy, and more
The foundation aims to provide data autonomy through its open source solutions and
standards.
SODA Foundation is home to projects for storage and data. It hosts many
projects and also extends the ecosystem through partners and third-party projects
that can help build unified data solutions for various use cases.
The SODA Ecosystem has many projects under its umbrella, which work in unison to
solve various data and storage challenges. Some of the important ones are:
• SODA Controller: In the API flow, the controller plays a critical role in
API flow management and tracking, handling all the state machine
and metadata management requirements.
• SODA Dock: A docking station for heterogeneous storage backends.
This is where the different storage vendors' drivers for various
backends get attached.
The official charter for SODA Foundation under the Linux Foundation can
be found at https://sodafoundation.io/the-foundation/charter/
An open source development model is the process used by an open source community
project to develop open source software. The software is then released under an open
source license, so anyone can view or modify the source code.
Many open source projects are hosted on GitHub, where you can access repositories or
get involved in community projects. Linux®, Ansible, and Kubernetes are examples of
popular open source projects.
There are lots of reasons why people choose open source over proprietary software,
but the most common ones are:
• Peer review: Because the source code is freely accessible and the open source
community is very active, open source code is actively checked and improved upon
by peer programmers. Think of it as living code, rather than code that is closed and
becomes stagnant.
• Transparency: Need to know exactly what kinds of data are moving where, or what
kinds of changes have happened in the code? Open source allows you to check and
track that for yourself, without having to rely on vendor promises.
• Reliability: Proprietary code relies on the single author or company controlling that
code to keep it updated, patched, and working. Open source code outlives its original
authors because it is constantly updated through active open source communities.
Open standards and peer review ensure that open source code is tested appropriately
and often.
• Flexibility: Because of its emphasis on modification, you can use open source code to
address problems that are unique to your business or community. You aren’t locked in
to using the code in any one specific way, and you can rely on community help and
peer review when you implement new solutions.
• Lower cost: With open source the code itself is free—what you pay for when you use
a company like Red Hat is support, security hardening, and help managing
interoperability.
• No vendor lock-in: Freedom for the user means that you can take your open source
code anywhere and use it for anything, at any time.
• Open collaboration: The existence of active open source communities means that you can
find help, resources, and perspectives that reach beyond one interest group or one company.
Terra
Introduction
SODA Controller is an open source implementation of all the control services (such as
metadata management, scheduling, other bookkeeping, and utilities). It is currently
kept in a separate repository, since many core services could be developed under
it for the overall data store framework.
It is part of SODA Terra (SDS Controller). Two other repositories are also part of
SODA Terra: API and Dock.
In the API flow from SODA API to SODA Dock, the controller plays a critical role in
API flow management and tracking, handling all the state machine and
metadata management requirements. It also serves as the layer where add-ons for new
services, facilities, or utilities for the SODA data platform are kept.
This layer may become optional going forward, so that a deployment picks only the
services it needs from the controller. In such cases, however, users need to integrate
their controller modules with API and Dock themselves.
The controller interfaces with SODA API and Dock.
This is one of the SODA Core Projects and is maintained directly by SODA Foundation.
Project Summary
Website https://sodafoundation.io/projects/delfin/
Open/Proprietary Open-source
Source Path(if open source) https://github.com/sodafoundation/delfin/
Brief Description The Delfin project is an open-source initiative that develops software
tools to simplify data management for non-profit organizations. It
provides a unified data management platform that leverages open-source
technologies for data ingestion, processing, storage, and analytics, as
well as data governance, security, and compliance. The project aims to
empower non-profit organizations with the tools they need to manage
complex and rapidly growing data volumes, enabling them to make
informed decisions and achieve their mission more effectively.
Table 1: Project Summary
Project Details
Key Features
Key features of the Delfin project's unified data management platform include:
• Data ingestion and processing: The platform provides tools for ingesting and processing
large and complex data volumes, allowing organizations to make sense of their data and
derive insights.
• Data storage and analytics: The platform offers scalable and reliable data storage options,
as well as powerful analytics tools for gaining deeper insights into the data.
• Data governance, security, and compliance: The platform includes features for managing
data governance, ensuring data security, and complying with regulations.
• Customization and flexibility: The platform is highly customizable and flexible, allowing
organizations to tailor it to their specific needs and workflows.
• User-friendly interface: The platform is designed with a user-friendly interface that makes it
easy for non-technical users to work with data and derive insights.
• Machine learning algorithms: The platform leverages machine learning algorithms to help
organizations make sense of their data and discover patterns and insights.
• Integration with other tools: The platform can be integrated with other tools and systems,
such as data visualization tools, business intelligence tools, and CRM systems, to create a
complete data management and analysis ecosystem.
• Multi-tenant support: The platform provides support for multiple tenants, allowing different
departments or teams within an organization to work with their own data sets and workflows.
Architecture
The Delfin project is an open-source initiative that simplifies data management for non-profit
organizations dealing with complex and rapidly growing data volumes. It achieves this through a
modern and scalable architecture that leverages existing open-source technologies.
The architecture of the Delfin project can be divided into four main layers:
• Data Ingestion Layer: This layer is responsible for collecting and processing data from various
sources. It includes connectors for various data sources such as databases, file systems, and
streaming platforms. The data is then transformed and enriched to prepare it for storage and analysis.
• Data Storage and Processing Layer: This layer is responsible for storing and processing the data
ingested by the previous layer. It includes storage systems such as Hadoop Distributed File System
(HDFS) and object storage systems such as OpenSDS. Processing is done using technologies such as
Apache Spark and Kubernetes.
• Data Analysis and Governance Layer: This layer is responsible for providing analytics and
governance capabilities. It includes tools for data visualization, reporting, and governance such as
Apache Superset and OpenDS4All. It also includes tools for data security and compliance such as
Open Policy Agent and Apache Ranger.
• Data Management Layer: This layer is responsible for managing the data lifecycle. It includes
tools for data discovery, cataloging, and metadata management such as Apache Atlas and Open
Metadata. It also includes machine learning algorithms for optimizing data management operations
such as data placement, replication, and compression.
The Delfin project architecture is designed to be cloud-native and can be deployed on various cloud
platforms or on-premises infrastructure. It also supports various data formats, including structured
and unstructured data, making it highly flexible and scalable.
Current Usage
The Delfin project is a relatively new initiative and its usage is still growing. However, it has already
gained traction among some organizations who are looking for a unified data management platform
to help them manage their data more efficiently.
One example of a current usage of the Delfin project is in the healthcare industry. Delfin is being
used by healthcare organizations to manage and analyze large volumes of patient data, including
electronic health records, medical imaging, and clinical data. Delfin's machine learning algorithms
and data analytics capabilities are helping healthcare organizations gain insights into patient care,
optimize resource allocation, and improve patient outcomes.
Technical Details
The Delfin project, which is part of the SODA Foundation, is designed to provide a highly scalable
and performant data management platform for enterprises dealing with complex and rapidly growing
data volumes. Its architecture is cloud-native and can scale horizontally and vertically
as needed, using technologies such as Kubernetes and Hadoop Distributed File System (HDFS). This
allows it to handle large and rapidly growing data volumes with ease.
Delfin supports various data formats, including structured and unstructured data, and
provides tools for data visualization, reporting, and governance. It also includes
machine learning algorithms for optimizing data management operations. Being an
open-source project, Delfin is licensed under the Apache 2.0 license, which provides
transparency and flexibility for users.
Overall, the technical details of the Delfin project highlight its ability to provide a
modern, scalable, and high-performing data management platform that can meet the
needs of enterprises with complex and rapidly growing data volumes.
• Project website: https://sodafoundation.io/projects/delfin/
• GitHub repository: https://github.com/sodafoundation/delfin/
• SODA Foundation website: https://sodafoundation.io/
• Apache Spark website: https://spark.apache.org/
• OpenSDS website: https://opensds.io/
• Kubernetes website: https://kubernetes.io/
• Hadoop website: https://hadoop.apache.org/
Acknowledgements:
I would like to acknowledge the SODA Foundation for their contributions to the
open-source community and their development of the DELFIN project. Additionally, I
would like to thank the Apache Spark, OpenSDS, Kubernetes, and Hadoop
communities for their contributions to the development of the technologies used in
DELFIN.
Project Summary
Website https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/
License Apache License 2.0
Open/Proprietary Open-source
Source Path(if open source) https://github.com/kubernetes/dashboar
Brief Description
Kubernetes Dashboard is a web-based user interface for managing
Kubernetes clusters. It provides a graphical interface for deploying,
scaling, and monitoring applications, as well as managing cluster
resources and accessing logs. The dashboard is highly customizable and
provides role-based access control for managing cluster resources.
Project Details
Key Features
Kubernetes Dashboard is a web-based graphical user interface (GUI) for managing
Kubernetes clusters. It provides an easy-to-use interface for viewing and managing
Kubernetes resources, such as deployments, pods, services, and replica sets. Some key
features of the Kubernetes Dashboard include:
• Real-Time Resource Monitoring: The dashboard provides real-time monitoring of
Kubernetes resources, allowing users to easily monitor the performance of their
applications.
• Intuitive User Interface: The dashboard has an intuitive user interface that makes it
easy to navigate and manage Kubernetes resources.
• Simplified Resource Management: Users can manage Kubernetes resources
through the dashboard without the need for complex command-line tools.
• Role-Based Access Control: The dashboard supports role-based access control
(RBAC), which allows administrators to control access to the Kubernetes cluster
based on the roles of individual users.
• Interactive Resource Editing: The dashboard provides an interactive interface for
editing Kubernetes resources, making it easy to modify resources without having to
edit YAML files directly.
• Cluster-Level Metrics: The dashboard provides cluster-level metrics, allowing
users to monitor the overall health and performance of the Kubernetes cluster.
Overall, the Kubernetes Dashboard is a powerful tool for managing Kubernetes
clusters, providing an easy-to-use interface for monitoring and managing resources.
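Role-based access control for the dashboard is configured with ordinary Kubernetes RBAC objects. The sketch below builds a read-only binding as plain Python data and prints it as JSON, which kubectl accepts alongside YAML. The service account name `dashboard-viewer` is hypothetical, the `kubernetes-dashboard` namespace is an assumption about how the dashboard was installed, and `view` is the built-in read-only ClusterRole.

```python
import json


def read_only_binding(account: str, namespace: str) -> list:
    """Return a ServiceAccount and a ClusterRoleBinding granting it read-only access."""
    service_account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": account, "namespace": namespace},
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": f"{account}-view"},
        # roleRef points at the built-in ClusterRole named "view".
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": "view",
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": account, "namespace": namespace}
        ],
    }
    return [service_account, binding]


# Wrap both objects in a List so they can be applied from a single file.
manifest = {
    "apiVersion": "v1",
    "kind": "List",
    "items": read_only_binding("dashboard-viewer", "kubernetes-dashboard"),
}
print(json.dumps(manifest, indent=2))
```

Saving the printed output to a file and passing it to `kubectl apply -f` would create both objects. Someone signing in to the dashboard with that service account's token could then browse most namespaced resources without being able to change them.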
Architecture
The Kubernetes Dashboard is a web-based user interface that allows users to manage and monitor
their Kubernetes clusters. It is split into components with separate responsibilities. The frontend
is built with Angular (early releases used AngularJS), and a Go backend queries the Kubernetes API
server for information about the cluster. Resource utilization data originally came from the Heapster
service; Heapster has since been retired, and later releases read usage data from Metrics Server
through a metrics scraper component. For secure access from outside the cluster, the dashboard can
be exposed through an Ingress. It can be deployed in several ways, including YAML manifests and
Helm charts.
Current Usage
Kubernetes Dashboard is used as a web-based graphical user interface (GUI) for managing
Kubernetes clusters.
It allows users to easily monitor and control the state of their Kubernetes resources such as
deployments, pods, services, and replicasets.
Kubernetes Dashboard can be used for troubleshooting and debugging issues within a
Kubernetes cluster by providing real-time status updates and logs of the running containers.
It is also used to manage and configure Kubernetes cluster settings and security policies.
Kubernetes Dashboard is commonly used by DevOps teams and developers to deploy and
manage applications on Kubernetes clusters.
Additionally, it provides an easy-to-use interface for managing Kubernetes resources, making
it accessible for users with varying levels of technical expertise.
Architecture:
Acknowledgements:
I would like to express my heartfelt gratitude to everyone who has contributed to this project
in one way or another. Your support and encouragement have been instrumental in making
this project a success.
I would like to thank [random name] for their valuable insights and feedback, which helped
me to improve the quality of this project. I am also grateful to the Kubernetes team for their
technical assistance and guidance.
I would like to extend my appreciation to the project lead for their support and encouragement
throughout the project. Their motivation kept me going, even during challenging times.
I would also like to acknowledge the support of the Kubernetes project, who helped me to stay
organized and focused throughout the project. Their assistance with project management was
invaluable.
Finally, I would like to thank my family and friends for their love, encouragement, and
support. Without them, this project would not have been possible.
Open Source Project
WordPress
WordPress project
Introduction
WordPress is a free and open-source content management system (CMS) that
powers over 40% of all websites on the internet. It was first released in 2003
and has since grown to become the most popular CMS in the world.
As open-source software, WordPress is freely available to download, use,
modify, and distribute. This means that anyone can access the source code of
WordPress, make changes to it, and share those changes with others. This
collaborative and decentralized approach has allowed WordPress to evolve
and improve over time, benefiting from the input of a large community of
developers, designers, and users.
WordPress is written in PHP and uses a MySQL database to store content. It is
designed to be user-friendly and customizable, with a wide range of themes
and plugins available to extend its functionality. Users can create and publish
content easily, manage comments, and interact with their audience through
built-in social media integrations.
Overall, WordPress's open-source nature has enabled it to become the
backbone of the internet, powering everything from personal blogs to major
news websites and online stores.
Project Summary
Website https://wordpress.org/
Organization/Foundation Name The WordPress Foundation
License GPLv2
Open/Proprietary Open-source
Project Details
Key Features
• User-friendly interface: WordPress provides a user-friendly dashboard that allows users to
easily create and manage content, customize their website, and interact with their audience.
• Customizable: WordPress offers a wide range of themes and plugins that users can use to
customize their website's design and functionality.
• Open-source: WordPress is open-source software that is freely available to download,
use, modify, and distribute, making it accessible to anyone who wishes to use it.
• SEO-friendly: WordPress is designed to be SEO-friendly, with built-in tools that help users
optimize their content for search engines.
• Security: WordPress takes security seriously, with regular updates and security patches to
protect against potential vulnerabilities.
• Large and active community: WordPress has a large and active community of developers,
designers, and users who contribute to its evolution and improvement over time.
• E-commerce ready: WordPress offers a range of plugins and integrations for e-commerce
websites, making it easy to set up and manage online stores.
Architecture
WordPress is typically deployed on the LAMP stack, which includes the following components:
• Linux: the operating system
• Apache: the web server
• MySQL: the database that stores site content
• PHP: the language WordPress is written in
WordPress uses a modular architecture, with the core software providing basic functionality and
additional features added through plugins and themes. The WordPress core includes essential
features such as user management, content creation, and website management. Plugins and themes
can be installed and activated to add functionality and customize the website's design and
layout.
Current usage
• WordPress powers over 40% of all websites on the internet, making it the most
widely used CMS in the world.
• WordPress can be customized through the use of themes and plugins, which allow
users to add new features and change the appearance of their websites.
• WordPress is regularly updated with new features and security patches to ensure the
software remains stable and secure.
• WordPress is used for a wide range of websites, including blogs, e-commerce sites,
portfolios, and more.
• WordPress is compatible with a variety of web hosting services and can be installed
on most web servers.
• Theme system: WordPress uses a theme system that allows developers to change
the look and feel of a website without changing the underlying code. This makes it
easy to create custom designs for websites.
• Plugin system: WordPress uses a plugin system that allows developers to extend the
functionality of the CMS. There are over 58,000 free plugins available in the official
WordPress plugin repository, and countless more available from third-party sources.
• Scalable: WordPress is highly scalable and can be used to build websites of any size
or complexity. It can handle large amounts of traffic and content, and can be easily
extended with additional resources like caching and load balancing.
• Large community: WordPress has a large and active community of developers and
users who contribute to the software, provide support, and create plugins and themes.
Other information
WordPress is also a constantly evolving platform, with new updates and features being
released regularly. Some of the latest updates to WordPress include:
• WordPress 5.8: Released in July 2021, WordPress 5.8 introduced several new features, including
support for WebP images, improvements to the block editor, and enhancements to the theme editor.
• Block Editor: The block editor, also known as Gutenberg, is a new way of creating content in
WordPress. It uses blocks to organize content and allows for more flexible and dynamic layouts.
• WordPress REST API: The WordPress REST API allows developers to access and manipulate
WordPress content using HTTP requests. This opens up new possibilities for integrating WordPress
with other tools and services.
• WordPress Plugins: WordPress has a vast ecosystem of plugins, which are third-party tools that
can extend the functionality of a WordPress site. There are thousands of plugins available for
WordPress, ranging from simple utilities to complex systems for managing e-commerce sites.
• WordPress Themes: WordPress themes control the appearance of a WordPress site. There are
thousands of free and premium themes available for WordPress, allowing users to easily customize
the look and feel of their site.
Overall, WordPress offers a flexible and customizable platform for building websites of any size or
complexity. With its vast ecosystem of plugins and themes, as well as its powerful REST API,
WordPress is a popular choice for developers and website owners alike.
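As a rough illustration of the REST API mentioned above, the sketch below builds a request URL for a site's posts and shows how the JSON response could be read. The site address `https://example.com` is a placeholder and the helper names are hypothetical. `/wp-json/wp/v2/posts` with the `per_page` and `search` query parameters is the core posts endpoint, though individual sites can disable or restrict it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def posts_url(site: str, per_page: int = 5, search: str = "") -> str:
    """Build the URL that lists published posts on a WordPress site."""
    params = {"per_page": per_page}
    if search:
        params["search"] = search
    return f"{site.rstrip('/')}/wp-json/wp/v2/posts?{urlencode(params)}"


def fetch_post_titles(site: str, per_page: int = 5) -> list:
    """Fetch posts over HTTP and return their rendered titles (needs network access)."""
    with urlopen(posts_url(site, per_page), timeout=10) as response:
        posts = json.load(response)
    return [post["title"]["rendered"] for post in posts]


# Only the URL is built here, so the example runs without a network connection.
print(posts_url("https://example.com/", per_page=3, search="open source"))
```

Calling `fetch_post_titles` against a real site would perform the HTTP request. Reading public posts needs no authentication, while creating or editing content does.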
Project References
How to contribute to Open Source?
Contributing to open source can be a great way to learn and grow as a developer
while making a positive impact on the community. Here are some steps to get
started:
• Choose a project: Find an open-source project that interests you and aligns with
your skills. Look for projects with active communities, good documentation, and
open issues that need attention.
• Set up the environment: Follow the project's instructions for setting up a
development environment, including installing any necessary dependencies and
tools.
• Find an issue to work on: Look through the project's issue tracker and find an
issue that you can work on. Start with simpler issues and work your way up to
more complex ones as you gain experience.
• Read the codebase: Take some time to understand the codebase and how the
project works. Read through the documentation and any relevant resources to
gain a better understanding of the project's goals and architecture.
• Write your code: Once you have a good understanding of the project and the
issue you're working on, write your code. Follow the project's coding standards
and practices, and make sure to write clear, concise code with good
documentation.
• Submit a pull request: When you're ready to submit your code, create a pull
request on the project's repository. Make sure to include a clear description of the
changes you've made, and be responsive to any feedback or requests for changes
from the project maintainers.
• Stay engaged: Once your code is merged, continue to stay engaged with the
project community. Offer help to others who are contributing, and keep an eye
out for new issues or opportunities to contribute.
Open Source Software: It refers to software that is developed and tested through open collaboration.
Proprietary Software: It refers to software that is solely owned by the individual or the organization that developed it.
Ways to Contribute
There are many ways to contribute to open source projects, regardless of your
skill level or experience. Here are some ways you can get involved:
• Identify the project and community: First, you need to identify an open source
project that you are interested in and check if it has an active community. You
can use platforms like GitHub or GitLab to search for open source projects.
• Understand the project: Once you find a project, take some time to understand
its purpose, its goals, and its development process. Read the project's
documentation, explore the code, and look for issues that need fixing.
• Join the community: Join the community through the project's communication
channels, which may include a mailing list, a forum, a chat room, or a social
media group. Introduce yourself and ask how you can contribute.
• Pick a task: Look for an issue that matches your skills and interests. Start with
simple tasks, like fixing a typo, adding documentation, or writing a test case.
You can also ask the community for suggestions on what tasks to work on.
• Fork the project: Once you have identified a task, fork the project's repository
and clone it to your local machine.
• Make the changes: Make the changes needed to solve the issue. Use best
practices, such as writing clear commit messages, following the project's
coding style, and testing your changes.
• Submit a pull request: Once you are done with the changes, submit a pull
request to the project's repository. Describe what changes you have made and
why they are important. Be open to feedback and be prepared to make further
changes if needed.
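The fork, change, and pull request steps above map onto a short sequence of Git commands. The sketch below only assembles and prints that sequence rather than running it. The fork URL, branch name, file path, and commit message are hypothetical, and a real project may prescribe a different branching or commit-message convention.

```python
import shlex


def contribution_commands(fork_url, branch, files, message):
    """Return the shell commands for one contribution, in the order they are run."""
    # The clone lands in a directory named after the repository.
    repo_dir = fork_url.rstrip("/").rsplit("/", 1)[-1]
    if repo_dir.endswith(".git"):
        repo_dir = repo_dir[:-4]
    return [
        ["git", "clone", fork_url],               # copy your fork to the local machine
        ["cd", repo_dir],                         # enter the cloned repository
        ["git", "checkout", "-b", branch],        # work on a separate branch
        ["git", "add", *files],                   # stage the changed files
        ["git", "commit", "-m", message],         # record the change locally
        ["git", "push", "-u", "origin", branch],  # publish the branch to your fork
    ]


steps = contribution_commands(
    "https://github.com/your-username/example-project.git",  # hypothetical fork
    "fix-readme-typo",
    ["README.md"],
    "Fix typo in README",
)
for step in steps:
    print(shlex.join(step))
```

After the final push, the pull request itself is opened from the hosting platform's web interface, where the description and the review discussion live.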
Contribution Flowchart