Automatic Bug Localization
- The goal of automatic bug localization is to find the parts of a code base that are responsible for a reported bug.
- I use Information Retrieval (IR) techniques to build a software search engine.
- An IR-based bug localization system returns a ranked list of source code files relevant to a bug report.
- The traditional approach in bug localization research is the so-called Bag-of-Words (BOW) model.
- In a BOW model, only the frequencies of the terms appearing in bug reports and source code files are considered.
- The position and ordering of the terms carry no meaning.
- This is a problem because words in any language, whether a programming language or a natural language, take their meaning from context.
- Therefore, a term-term dependency model that imposes positional and ordering constraints is needed.
- I use a Markov Random Field (MRF) based approach to enforce position and ordering constraints in the IR model.
- If you are interested in knowing how MRF can be used to build a search engine, you will enjoy reading our MRF paper.
- For datasets, you can refer to the BUGLinks dataset or the moreBugs dataset.
- The source code for this project is written in Java.
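The difference between a BOW score and an order-aware score can be sketched in a few lines. This is a toy illustration only, not the project's MRF model (which is written in Java): the helper names (`bow_score`, `ordered_pair_score`, `rank`), the `pair_weight` parameter, and the two example files are all hypothetical. The ordered-pair term simply counts query bigrams that appear adjacent and in the same order in a file, which is one simple way to add a term-term dependency on top of term frequencies.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bow_score(query_tokens, doc_tokens):
    """Bag-of-Words score: sum of document frequencies of the query terms."""
    tf = Counter(doc_tokens)
    return sum(tf[t] for t in query_tokens)


def ordered_pair_score(query_tokens, doc_tokens):
    """Count adjacent query term pairs that occur adjacent, in order, in the document."""
    doc_bigrams = Counter(zip(doc_tokens, doc_tokens[1:]))
    return sum(doc_bigrams[p] for p in zip(query_tokens, query_tokens[1:]))


def rank(query, docs, pair_weight=0.0):
    """Rank file names by BOW score plus a weighted ordered-pair score."""
    q = tokenize(query)
    scored = []
    for name, text in docs.items():
        d = tokenize(text)
        score = bow_score(q, d) + pair_weight * ordered_pair_score(q, d)
        scored.append((score, name))
    # Highest score first; ties broken alphabetically by file name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]


# Hypothetical "source files", reduced to a bag of identifiers and comments.
docs = {
    "Parser.java": "open stream then read file into buffer",
    "Cache.java": "file handle closed before read completes",
}

bow_only = rank("read file", docs)                       # frequencies only
with_order = rank("read file", docs, pair_weight=1.0)    # frequencies + ordering
```

With frequencies alone the two files tie, because each contains "read" and "file" once. Once the ordered-pair term is switched on, the file in which "read file" appears as a phrase moves to the top of the ranking.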
3D Modeling of Dormant Fruit Trees
- Creating a very accurate 3D model of fruit trees is a challenging task.
- The ultimate goal of the project is to automate the process of pruning fruit trees.
- For this precision agriculture task, a 3D model of the tree is required to locate candidate branches for pruning.
- For details about this project please refer to the project webpage here.
- If you want to play with Kinect2 depth images of indoor and outdoor dormant trees, you can download the datasets here, here, and here.
- If you use these datasets, please give us credit by citing our work.
- Collecting depth images of outdoor orchard trees was quite an experience, especially in the snow!
- Most of the source code is in MATLAB and C++.
Other short term projects
- The goal is to classify reviews as positive or negative.
- The following methods were implemented in Python and evaluated for the task:
- Decision Trees, Bagged Trees, Random Forest, Boosted Trees
- Support Vector Machines, Logistic Regression, Naive Bayes
- SVM seems to outperform all other models on this dataset.
- Ensemble methods such as Boosted Decision Trees and Random Forests also perform well.
- Naive Bayes and Decision Trees do not perform very well.
- Source code with dataset
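As a rough illustration of the review classification task, here is a minimal multinomial Naive Bayes classifier with add-one smoothing, written with the standard library only. It is a sketch under stated assumptions, not the project's code: the function names (`train_nb`, `predict_nb`) and the four toy reviews are made up for the example, and the evaluated models listed above were presumably built with a machine learning library rather than by hand.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Collect per-label word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)   # label -> word -> count
    label_counts = Counter()             # label -> number of reviews
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_logp = None, -math.inf
    for label in sorted(label_counts):
        logp = math.log(label_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # ignore words never seen during training
            count = word_counts[label][tok]
            logp += math.log((count + 1) / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Toy training reviews, invented for illustration only.
train = [
    ("great movie loved it", "pos"),
    ("wonderful acting great plot", "pos"),
    ("terrible movie hated it", "neg"),
    ("boring plot awful acting", "neg"),
]
model = train_nb(train)
prediction_a = predict_nb(model, "great acting")
prediction_b = predict_nb(model, "awful movie")
```

On this toy data, "great acting" is scored as positive and "awful movie" as negative. A real evaluation would need a held-out test set and a proper tokenizer.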
- Implemented a Convolutional Neural Network (CNN) for MNIST classification in MATLAB.
- Achieved 98% accuracy, which is expected for MNIST classification with a CNN.
- The goal is to cluster the digits.
- Implemented k-means clustering in Python.
- Also experimented with sklearn's agglomerative clustering methods.
- Source code with dataset
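The k-means step can be sketched as follows. This is a minimal version of Lloyd's algorithm in plain Python, assuming points are tuples of numbers and the initial centroids are passed in explicitly so the result is deterministic. The function name `kmeans` and the six 2D points are hypothetical; digit images would be flattened into much longer vectors, but the loop is the same.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    dists = [squared_distance(point, c) for c in centroids]
    return dists.index(min(dists))


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                updated.append(mean)
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Two well-separated toy groups, invented for illustration.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, [(0, 0), (10, 10)])
```

The result depends on the initial centroids, so in practice the algorithm is usually run several times with different starting points.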
- PCA and LDA were implemented in MATLAB.
- LDA converges earlier than PCA.
- The FacePix dataset was used for experimentation.
- Implemented Zhang's algorithm for finding intrinsic and extrinsic camera parameters.
- Implemented a Harris corner detector.
- Used OpenCV's SIFT and SURF implementations in Python.
- The goal is to route traffic using a max flow min distance algorithm.
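The exact routing algorithm is not described here, so the following is only a sketch of the max-flow half of the idea, using the Edmonds-Karp method (augmenting along shortest paths found by breadth-first search). The function name `max_flow`, the dict-of-dicts graph format, and the small road network are assumptions made for the example.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    if source == sink:
        raise ValueError("source and sink must differ")
    # Build the residual graph, adding zero-capacity reverse edges.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {})
            residual[v].setdefault(u, 0)
    residual.setdefault(source, {})
    residual.setdefault(sink, {})

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u

        # Push flow along the path and update reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical road network: capacities are vehicles per time unit.
roads = {
    "A": {"B": 3, "C": 2},
    "B": {"C": 1, "D": 2},
    "C": {"D": 3},
    "D": {},
}
total = max_flow(roads, "A", "D")
```

Taking distance into account as well would typically mean attaching a cost to each edge and solving a minimum-cost flow problem, which this sketch does not attempt.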