jyanqa/aflt_project

 
 

Exploring the SoPa Model

A course project for the Advanced Formal Language Theory class, Spring 2022, at ETH Zurich.

The project explores the SoPa model, introduced in "SoPa: Bridging CNNs, RNNs, and Weighted Finite-State Machines" by Roy Schwartz, Sam Thomson, and Noah A. Smith (ACL 2018). The model is characterized by its pattern-matching mechanism.

Dataset

Following the SoPa experiments, our ablation study uses 100 movie reviews with binary labels from the SST dataset.

Approaches

We conducted the study at three levels: the automaton level, the automata level, and SoPa as a whole. For reproducibility, please refer to the README_reproducible_code.md file.

Findings

Across multiple experiments, we draw the following conclusions about how parameters affect model performance.

  • First, increasing the patterns' width improves model performance, while their length has little effect.

  • Second, when ε-transitions are excluded, a larger max-steps-forward yields better performance.

  • Also, test accuracy trends downward as the value of shared_sl increases, because a larger value decreases the complexity of the model.

  • Third, the log-space max-times semiring achieves the best accuracy.

  • Finally, training the patterns on multiple documents improves the gradient computation and hence the convergence rate.
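To make the semiring finding more concrete, here is a minimal sketch of how a single linear-chain pattern could be scored under the max-times semiring and under its log-space counterpart. It is not the code from this repository or from the original SoPa implementation: the names `Semiring`, `best_match_score`, `main`, and `self_loop` are hypothetical, and the sketch assumes a pattern with only main-path transitions and self-loops (no ε-transitions) and strictly positive weights.

```python
import math
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for the four semiring ingredients."""
    zero: float                              # identity of plus
    one: float                               # identity of times
    plus: Callable[[float, float], float]
    times: Callable[[float, float], float]


# Max-times on weights in [0, 1]: plus = max, times = product.
MAX_TIMES = Semiring(0.0, 1.0, max, lambda a, b: a * b)
# The same semiring in log space: plus = max, times = sum of logs.
LOG_MAX_TIMES = Semiring(-math.inf, 0.0, max, lambda a, b: a + b)


def best_match_score(main: List[List[float]],
                     self_loop: List[List[float]],
                     sr: Semiring) -> float:
    """Score the best match of one pattern anywhere in a document.

    main[t][i-1]      : weight of moving from state i-1 to state i on token t
    self_loop[t][i-1] : weight of staying in state i on token t
    (both layouts are assumptions made for this sketch)
    """
    d = len(main[0])                         # number of main-path transitions
    hidden = [sr.zero] * (d + 1)
    best = sr.zero
    for m_t, s_t in zip(main, self_loop):
        prev = hidden[:]
        prev[0] = sr.one                     # a match may start at any token
        hidden = [sr.zero] * (d + 1)
        for i in range(1, d + 1):
            advance = sr.times(prev[i - 1], m_t[i - 1])
            stay = sr.times(prev[i], s_t[i - 1])
            hidden[i] = sr.plus(advance, stay)
        best = sr.plus(best, hidden[d])      # keep the best end-state value
    return best


# Toy weights for a 3-token document and a pattern with 2 transitions.
main = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
self_loop = [[0.3, 0.3], [0.4, 0.4], [0.1, 0.1]]

prob_score = best_match_score(main, self_loop, MAX_TIMES)

log_main = [[math.log(w) for w in row] for row in main]
log_self_loop = [[math.log(w) for w in row] for row in self_loop]
log_score = best_match_score(log_main, log_self_loop, LOG_MAX_TIMES)
```

Both calls select the same best path; the log-space version replaces products with sums, a common way to avoid numerical underflow when many weights below 1 are multiplied along a long match.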

Contact

For questions, comments or feedback, please email quynguyen@ethz.ch

Code | Report


Languages

  • Python 81.4%
  • Shell 18.6%