epochx / Commitgen
Code and data for the paper "A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes"
A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes
- Requirements
  - Torch, Cutorch (http://torch.ch/docs/getting-started.html)
  - Python packages unidiff, pygments:
    pip install unidiff pygments
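A quick way to confirm that the two Python packages are importable before running the pipeline is sketched below. The helper name `missing_packages` is hypothetical and not part of this repository; it relies only on the standard library.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The package names come from the requirements listed above.
    missing = missing_packages(["unidiff", "pygments"])
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("unidiff and pygments are available.")
```

This only checks the Python side; it says nothing about whether Torch and Cutorch are installed.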
- Setup environment
  - Clone this repository:
    cd ~
    git clone https://github.com/epochx/commitgen-dev.git
  - Create data path:
    mkdir -p ~/data/preprocessing
  - Export env variable:
    export WORK_DIR=~/data
    (without trailing slash!)
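The trailing-slash requirement on WORK_DIR is easy to get wrong, so a small check can help. The sketch below is an illustration only: `work_dir_problem` and `preprocessing_dir` are hypothetical helpers, not functions from this repository, and the only facts taken from the setup steps are the variable name WORK_DIR and the preprocessing subfolder.

```python
import os
from pathlib import Path


def work_dir_problem(value):
    """Return a message describing what is wrong with a WORK_DIR value, or None."""
    if not value:
        return "WORK_DIR is empty or not set"
    if value.endswith("/"):
        return "WORK_DIR must not end with a slash"
    return None


def preprocessing_dir(work_dir):
    """Return the preprocessing folder under WORK_DIR, expanding a leading ~."""
    return Path(work_dir).expanduser() / "preprocessing"


if __name__ == "__main__":
    value = os.environ.get("WORK_DIR", "")
    problem = work_dir_problem(value)
    # Report the problem, or show where pre-processed pickles are expected.
    print(problem if problem else "OK: " + str(preprocessing_dir(value)))
```

The check reads the variable without modifying anything on disk, so it is safe to run before or after creating the data path.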
- Download our paper data
  - Get the raw commit data used in our paper from https://osf.io/67kyc/?view_only=ad588fe5d1a14dd795553fb4951b5bf9 (click on "OSF Storage" and then on "Download as zip"). Unzip the file where convenient.
  - Unzip the desired dataset zip and move the resulting folder to ~/data.
- Pre-process data
  - Parse and filter commits and messages:
    cd ~/commitgen
    python ./preprocess.py FOLDER_NAME --language LANGUAGE
    where FOLDER_NAME is the name of the folder from the previous step. Add the '--atomic' flag to keep only atomic commits. This generates a pre-processed version of the dataset as a pickle file in ~/data/preprocessing. Try python ./preprocess.py --help for more details on additional pre-processing parameters.
  - Generate training data:
    cd ~/commitgen
    ./buildData.sh PICKLE_FILE_NAME LANGUAGE
    (PICKLE_FILE_NAME with no .pickle)
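To give a feel for what filtering on atomic commits might involve, here is a minimal sketch. It assumes that "atomic" means a commit that changes exactly one file; that definition is an assumption, and the actual rule implemented by preprocess.py may differ. The helpers `changed_files` and `is_atomic` are hypothetical, and the sketch uses plain string handling rather than unidiff.

```python
def changed_files(diff_text):
    """List the paths named in the 'diff --git' headers of a git diff."""
    files = []
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            # Header form: diff --git a/<path> b/<path>; keep the b/ side.
            parts = line.rsplit(" b/", 1)
            if len(parts) == 2:
                files.append(parts[1])
    return files


def is_atomic(diff_text):
    """Assumed definition: a commit is atomic when it changes exactly one file."""
    return len(changed_files(diff_text)) == 1


# A made-up single-file diff, used only to exercise the helpers.
SAMPLE = """diff --git a/src/app.py b/src/app.py
--- a/src/app.py
+++ b/src/app.py
@@ -1,2 +1,2 @@
-print("hi")
+print("hello")
"""

if __name__ == "__main__":
    print(changed_files(SAMPLE), is_atomic(SAMPLE))
```

Counting header lines is deliberately simple; paths that themselves contain " b/" would need a real diff parser.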
- Train the model
  - Run the model:
    cd ~/commitgen
    ./run.sh PICKLE_FILE_NAME LANGUAGE
    (PICKLE_FILE_NAME with no .pickle)
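Both buildData.sh and run.sh take the pickle file name without its .pickle suffix, which is easy to forget when copying a path. The sketch below derives that argument and assembles the two commands in the order given above. The helpers `pickle_arg` and `pipeline_commands` are hypothetical, and the sketch assumes the scripts expect a bare file name rather than a full path.

```python
from pathlib import Path


def pickle_arg(pickle_path):
    """Return the file name without its .pickle suffix, as the scripts expect."""
    name = Path(pickle_path).name
    return name[: -len(".pickle")] if name.endswith(".pickle") else name


def pipeline_commands(pickle_path, language):
    """Build the argv lists for the two shell scripts, in the order they are run."""
    arg = pickle_arg(pickle_path)
    return [
        ["./buildData.sh", arg, language],
        ["./run.sh", arg, language],
    ]


if __name__ == "__main__":
    # "example.pickle" and "LANGUAGE" are placeholders, not real values.
    for command in pipeline_commands("example.pickle", "LANGUAGE"):
        print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, cwd=..., check=True)` with `cwd` pointing at the cloned repository, so that a failure in the first script stops the second from running.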
You can also download additional GitHub project data with our crawler. Run cd ~/commitgen and then python crawl_commits.py --help for details on how to use it.
