Multiple Kernel Learning for Extracting Protein-Protein Interactions

For this semester’s Bioinformatics course, I applied several multiple kernel learning algorithms to the extraction of protein-protein interactions from biomedical literature. Below is a more formal description of the project, followed by a link to the report.

Determining whether two proteins interact is important because it provides clues in many research areas, such as the development of new drugs. One approach to extracting protein-protein interactions (PPIs) relies on machine learning and natural language processing to learn models that discriminate between positive and negative interactions based on the linguistic features of sentences. In this project, we apply multiple kernel learning (MKL) algorithms (rule-based, alignment-based, and the MKL formulation of Bach et al.) to three kernels (shallow linguistic, subtree, and k-BSPS) that use different linguistic features, and we analyze the results in terms of accuracy and the relative importance of each kernel.
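To make the idea of combining kernels more concrete, here is a minimal sketch of the two simpler weighting schemes, rule-based and alignment-based. It is not the project's code. The three Gram matrices are stand-in linear kernels built from random toy features rather than the shallow linguistic, subtree, and k-BSPS kernels, and all function names are hypothetical. The alignment weighting shown is one simple heuristic (weights proportional to kernel-target alignment) and may differ from the variant used in the report. The MKL formulation of Bach et al. is left out because it needs an optimization solver.

```python
import numpy as np

def normalize_kernel(K):
    """Scale a Gram matrix so that every diagonal entry equals 1."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

def alignment(K, y):
    """Kernel-target alignment between K and the ideal kernel y y^T."""
    Y = np.outer(y, y)
    # np.linalg.norm on a 2-D array is the Frobenius norm by default.
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

def rule_based_weights(kernels):
    """Fixed rule (assumed here to be the mean rule): equal weight per kernel."""
    return np.full(len(kernels), 1.0 / len(kernels))

def alignment_based_weights(kernels, y):
    """Heuristic: weight each kernel in proportion to its alignment with the labels."""
    a = np.array([max(alignment(K, y), 0.0) for K in kernels])
    return a / a.sum()

def combine(kernels, weights):
    """Weighted sum of Gram matrices; still a valid kernel for non-negative weights."""
    return sum(w * K for w, K in zip(weights, kernels))

# Toy data: 6 "sentences" labelled +1 (interaction) or -1 (no interaction).
rng = np.random.default_rng(0)
y = np.array([1, 1, 1, -1, -1, -1])
feats = [rng.normal(size=(6, 4)) for _ in range(3)]
feats[0][:, 0] += 2.0 * y  # make the first stand-in kernel carry label signal

kernels = [normalize_kernel(F @ F.T) for F in feats]

w_rule = rule_based_weights(kernels)
w_align = alignment_based_weights(kernels, y)
K_rule = combine(kernels, w_rule)
K_align = combine(kernels, w_align)

print("rule-based weights:     ", w_rule)
print("alignment-based weights:", w_align)
```

The combined matrix could then be passed to any classifier that accepts a precomputed kernel, for example a support vector machine. The learned weights also give a rough view of each kernel's importance, which is the second perspective analyzed in the project.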

Report
Source code is available on request due to its large size.