Machine Learning is NOT Magic
- lucreceshin
- Nov 19, 2020
- 2 min read
Updated: Jan 9, 2021

During the early stage of my Machine Learning experience, my problem-solving philosophy was to focus on modelling and careful selection of hyperparameters. Applying complex architectures from state-of-the-art deep learning papers seemed to work well. Over time, however, I've come to shift my attention to a different component of the Machine Learning pipeline: data.
What humans "learn" comes from their environment. Situations they were exposed to, things they saw, people they met shape them. The same applies to a machine learning model. The type and quality of the input features, along with the formulation of the labels, determine a model's performance. How is the data distributed? How can I augment the data so that the model becomes more robust? Are some features uninformative, only contributing noise? How can I extract the best features? Which subset of features is most informative together? How can I formulate the labels so that the model learns most efficiently? What kind of loss should I use to optimize with these features and labels? These are all important questions I ask myself while optimizing performance.
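As a minimal sketch of the "are some features only noise?" question, one rough heuristic is to score each feature by its absolute correlation with the label. The data below is a toy example I made up for illustration, not anything from an actual project:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two toy features: one informative, one pure noise (synthetic data).
informative = rng.normal(0, 1, n)
noise = rng.normal(0, 1, n)
y = (informative > 0).astype(int)  # label depends only on the first feature

X = np.column_stack([informative, noise])

# Absolute correlation of each feature with the label as a crude
# informativeness score; a near-zero score suggests the feature is noise.
scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
```

Correlation only captures linear relationships, so in practice something like mutual information or a permutation-importance check is a stronger filter, but the iterate-and-inspect idea is the same.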
It is also important to ask such questions iteratively. Formulate features and labels, put them through a model, and observe the results. Which aspects is the model most confused about? Would fewer features reduce noise? During an image domain adaptation project using CNNs, I saw poor classification accuracy on one particular class after the first try. Although my first instinct was that the CNN architecture was not complex enough, I checked the distribution of the input images using t-SNE. It turned out that the poorly classified class overlapped in distribution with another class, resulting in a sub-optimal decision boundary. I fixed the problem by scraping more images that contained the unique characteristics of that class.
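The kind of check described above can be sketched with scikit-learn's t-SNE. The "image features" here are synthetic Gaussian blobs standing in for flattened images, with two classes deliberately overlapping; none of this is the actual data from the project:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical stand-in for flattened image features: two classes whose
# distributions partially overlap, plus one well-separated class.
class_a = rng.normal(0.0, 1.0, (40, 64))
class_b = rng.normal(0.5, 1.0, (40, 64))   # overlaps with class_a
class_c = rng.normal(5.0, 1.0, (40, 64))   # well separated
X = np.vstack([class_a, class_b, class_c])
labels = np.array([0] * 40 + [1] * 40 + [2] * 40)

# Project to 2-D; clusters that blur into each other in the embedding
# hint at an ambiguous decision boundary between those classes.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
```

Plotting `embedding` colored by `labels` (e.g. with matplotlib's `scatter`) makes the overlap visible at a glance, which is essentially how I spotted the confused class.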
The more I focused on the data rather than the model, the more transparent the "black box" of a Machine Learning model became. A machine learning model is a "function". It is not too different from y = f(x) in grade 11 math class. It's just that x, y, or both might be much higher-dimensional, possibly incurring a more complex relationship. But unlike with a simple y = ax + b function, I can't solve for the weights on a sheet of paper; there might be hundreds of them, if not millions. So I let the computer solve for the optimal weights. These weights are just the numerical output of what I told it to do. I gave it the data it should use to map to a particular form of "y" labels, which I also formulated. I'm responsible for setting up the environment and the props; the model and the computer are only the machinery that does the computing.
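"Letting the computer solve for the weights" can be shown in a few lines on the grade-11 case itself: gradient descent recovering a and b of y = ax + b from data, using nothing but the squared-error gradients (the numbers are my own toy example):

```python
# Generate toy data from a known line, then pretend we don't know a and b.
true_a, true_b = 3.0, -1.0
xs = [i / 10 for i in range(-50, 50)]
ys = [true_a * x + true_b for x in xs]

a, b = 0.0, 0.0  # initial guesses for the weights
lr = 0.01        # learning rate

for _ in range(2000):
    # Gradients of mean squared error with respect to a and b.
    grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    a -= lr * grad_a
    b -= lr * grad_b
```

After the loop, a and b sit very close to 3.0 and -1.0. A deep network does the same thing with millions of weights and a harder loss surface, but the division of labor is identical: I supply the (x, y) pairs and the loss; the machine supplies the arithmetic.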
My new ML philosophy, backed by many hours of personal trial and error, is to analyze and optimize the data iteratively.