.
├── InputReflector.ipynb  # Code for InputReflector
├── Siamese.py            # Code for the Siamese Network and Quadruplet Network
├── adv_examples.jsonl    # Subset of the adversarial examples used to train the Siamese and Quadruplet Networks
├── model.py              # CodeBERT model
├── process_ga_csv.py     # Convert Alert's CSV file to adv_examples.jsonl
├── reveal_non_vul.jsonl  # Reveal Dataset, Non-vulnerable part
├── reveal_vul.jsonl      # Reveal Dataset, vulnerable part
├── test.jsonl            # Devign Dataset, testset
├── train.jsonl           # Devign Dataset, trainset
├── train.py              # Script to train the Siamese Network and Quadruplet Network
├── triplet_loss.py       # Triplet loss used in training
├── utils.py              # Utils
└── valid.jsonl           # Devign Dataset, validset
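
As a rough illustration of what a triplet loss computes, here is a minimal sketch in plain Python. It is not the code in triplet_loss.py, which may use a different distance, margin, or mining strategy; the function names, the Euclidean distance, and the default margin of 1.0 are all assumptions made for this example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss for a single (anchor, positive, negative) triple.

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is. The margin value here is an assumption.
    """
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)
```

The loss pulls the anchor toward the positive example and pushes it away from the negative one until the gap between the two distances reaches the margin.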

Download the fine-tuned CodeBERT model first:

pip install gdown
gdown https://drive.google.com/uc?id=14STf95S3cDstI5CiyvK1giLlbDw4ZThu

Install dependencies:

pip install wandb lightning torch transformers

Then train the Siamese Network or the Quadruplet Network with the following command:

python3 train.py --ood_non_vul_file=reveal_non_vul.jsonl \
	--ood_vul_file=reveal_vul.jsonl \
	--pretrained_model=model.bin \
	--train_file=adv_examples.jsonl \
	--batch_size=32 \
	--model_type=sia                        # use quad for the Quadruplet Network

The program automatically splits adv_examples.jsonl into a training set and a validation set, and uploads metrics and checkpoints to the wandb server. I strongly recommend using wandb to manage this data.
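
For readers who want to inspect or reproduce such a split outside the training script, the following is a minimal sketch of splitting a JSONL file into training and validation records. It is not taken from train.py: the function name `split_jsonl`, the 10% validation fraction, and the fixed seed are assumptions for illustration, and the actual script may split differently.

```python
import json
import random


def split_jsonl(path, valid_fraction=0.1, seed=0):
    """Read a JSONL file and return (train_records, valid_records).

    valid_fraction and seed are illustrative defaults, not the values
    used by train.py.
    """
    with open(path, encoding="utf-8") as f:
        # One JSON object per non-empty line.
        records = [json.loads(line) for line in f if line.strip()]
    rng = random.Random(seed)  # local RNG so the split is reproducible
    rng.shuffle(records)
    n_valid = int(len(records) * valid_fraction)
    return records[n_valid:], records[:n_valid]
```

For example, `train, valid = split_jsonl("adv_examples.jsonl")` would return two lists of parsed records that together cover every line of the file.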

Once training is done, use InputReflector.ipynb for inference; see the comments inside the notebook for details.

bytezzz/CodeInputReflector

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages