My Machine Learning Learning Experience (Part 1): UD120 Lesson 1-7
This semester I've been learning Machine Learning, and it has been a disaster so far. I don't understand what the prof's sayin', and sometimes I have to skip class for job interviews (so it's partly my fault too), so it's getting worse day by day. I started learning stuff on Udacity last September and it has really helped me a lot. In fact, I survived last semester thanks to Udacity. So in early February I turned to Udacity again and checked out the course taught by Sebastian Thrun and Katie Malone. It's been a delightful journey, so I decided to jot down what I've learnt and what I think of the experience. If you're thinking about learning Machine Learning but you're in the same situation as I am, or there are so many resources on the Internet that you don't know where to start, reading this may help.
February 22, 2016 (Mon)
Finished Lesson 1 and almost finished Lesson 2. Binge-watching Arrow.
Yes, it took me a whole week, sort of. I had just started the course so I wasn't taking it seriously, not yet, but when someone's tryin' to teach you machine learning with a self-driving car, it certainly gets you hooked. I really like how Thrun explained Bayes Rule. I was so excited that I went back to the lecture notes I got from CUHK to see if I understood any of them. That was when I noticed this course is more of a hands-on workshop than a traditional course that tries to teach everything back-to-back. I knew there was another machine learning course on Udacity, built by Georgia Tech, and Michael Littman teaches it too. I kinda like this guy's teaching style, so I was thinking about checking out that course as well. As for the SVM part, I think I've grasped the basics and know how it works on the surface, but I also know I still don't fully understand the magic behind it. I may need to dig deeper myself.
Also, I don't know whether it was intentional, but they didn't explain what the gamma parameter was in the lesson. The question came to mind after they explained what the cost parameter (C) was. I'm the kind of guy who likes to leave no stone unturned, so I did some Googling to find out exactly what gamma does. One thing led to another, and I found the course's playlist on YouTube, where it turned out they actually did make a video on the gamma parameter:
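Besides the video, a quick experiment is another way to get a feel for gamma. The sketch below is only an illustration: it assumes scikit-learn's SVC with an RBF kernel, and the toy data from make_moons is a hypothetical stand-in, not the course's dataset. Roughly speaking, a small gamma lets each training point influence a wide area (smoother boundary), while a large gamma makes the boundary hug individual points.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy two-class data (hypothetical stand-in, not the course's data)
X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Keep C fixed and vary only gamma to isolate its effect
results = {}
for gamma in (0.01, 1.0, 1000.0):
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma)
    clf.fit(X_train, y_train)
    results[gamma] = (clf.score(X_train, y_train), clf.score(X_test, y_test))

for gamma, (train_acc, test_acc) in results.items():
    print(f"gamma={gamma}: train accuracy={train_acc:.3f}, test accuracy={test_acc:.3f}")
```

Comparing the training and test accuracy for each gamma is usually enough to spot when a large gamma starts to overfit.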
March 29, 2016 Update:
You can check out my code here (Lesson 1 - Naive Bayes):
https://github.com/kevguy/Intro-to-Machine-Learning-ud120/tree/Stage01_Naive_Bayes
February 23, 2016 (Tue)
Finished what was left of Lesson 2 in Intro to Machine Learning and moved on to Lessons 3, 4 and 5. Binge-watching Arrow.
Lesson 3 is about Decision Trees. I already learned the basics last semester thanks to the lecture videos offered by UC Berkeley.
All I did was flip through some of my notes to refresh my memory on information gain and the general steps for constructing a decision tree by hand, and I was all set. The whole lesson was like a whirlwind to me, still fun though.
Lesson 4 requires you to explore an algorithm on your own. That was new to me. I've never taken any CS course at CUHK that required me to explore something on my own and explain it, and I kinda liked the feeling. They offered three algorithms to begin with. I only needed to choose one of them, but thanks to the simplicity of sklearn, it was hard to resist writing all three. I started with AdaBoost, which is an amazing algorithm and not hard to understand either, thanks to this guy's great analogy on Quora (link). Then I moved on to 'k-nearest neighbors' and 'random forest', and coded them up in about ten minutes. This is my favorite lesson so far.
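To show why trying all three takes so little effort, here is a minimal sketch. It is not the code from the mini project: the data comes from make_classification as a hypothetical stand-in, and the hyperparameters (50 estimators, 5 neighbors, 100 trees) are arbitrary choices for illustration. The point is only that the three classifiers share the same fit/score interface.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical 2-feature, 2-class data standing in for the lesson's data
X, y = make_classification(
    n_samples=500, n_features=2, n_informative=2, n_redundant=0, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Hyperparameters below are illustrative, not tuned
classifiers = {
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=42),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Same two calls for every algorithm: fit on training data, score on test data
accuracies = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    accuracies[name] = clf.score(X_test, y_test)
    print(f"{name}: test accuracy = {accuracies[name]:.3f}")
```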
Lesson 5 is still about Machine Learning, but it pays more attention to the Enron dataset itself. Most of the exercises are just there to help you understand what kinds of data you've got and how they are arranged. As long as you have a basic understanding of Python you'll be fine. One of the exercises took me quite a while though. They wanted me to check how the data is denoted when a person's email address or salary is not available in the dataset, and then tell them how many such entries there are. A quick check will tell you they use NaN (Not a Number). So I wrote a simple loop to count them. The salary part was easy: just use math.isnan() and you're done. The email address part was not so easy, because I was tricked. I used pandas and numpy to check whether the email address string was NaN, but neither worked. Turned out they literally wrote the string "NaN" when data wasn't available, and it took me ten minutes to find that out. So all I needed to do was check if the string equals "NaN", damn...
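The trap described above can be sketched in a few lines. The dictionary here is a made-up miniature (names, values and layout are hypothetical, not the real Enron data); the helper simply treats both the literal string "NaN" and a real float NaN as missing, so either representation gets counted.

```python
import math

# Hypothetical miniature dataset: person -> feature dict (not the real Enron data)
data = {
    "PERSON A": {"salary": 250000, "email_address": "a@example.com"},
    "PERSON B": {"salary": "NaN", "email_address": "NaN"},
    "PERSON C": {"salary": float("nan"), "email_address": "c@example.com"},
}


def is_missing(value):
    """Return True for the literal string "NaN" or a real float NaN."""
    if isinstance(value, str):
        return value == "NaN"
    return isinstance(value, float) and math.isnan(value)


def count_missing(dataset, feature):
    """Count how many people have a missing value for the given feature."""
    return sum(1 for person in dataset.values() if is_missing(person[feature]))


print(count_missing(data, "salary"))         # 2
print(count_missing(data, "email_address"))  # 1
```

Checking the string case first matters, because math.isnan() raises a TypeError when it is handed a string.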
March 29, 2016 Update:
You can check out my code here:
Lesson 2 - SVM:
https://github.com/kevguy/Intro-to-Machine-Learning-ud120/tree/Stage02_06_SVM_Rbf_Kernel_Predict_Some_Elements
Lesson 3 - Decision Trees:
https://github.com/kevguy/Intro-to-Machine-Learning-ud120/tree/Stage03_04_Email_Preprocess_Percentage_Rollback
Lesson 4 - Choose Your Own Algorithm:
https://github.com/kevguy/Intro-to-Machine-Learning-ud120/tree/Stage04_03_K_Neighbors
Lesson 5 - Datasets and Questions:
https://github.com/kevguy/Intro-to-Machine-Learning-ud120/tree/Stage05_08_Total_Payments_POI
February 24, 2016 (Wed)
Almost finished Lesson 6, stopped at the mini project and started writing this whole blog. Finished SL1 and SL2 in Machine Learning. Still binge-watching Arrow (never knew there was an Alpha-Omega virus outbreak in Hong Kong, I should have fled earlier).
Lesson 6 is about Regression. The whole curve-fitting idea is kind of intuitive, so I picked it up pretty fast. I think I also understand the whole least squares error thing, and I came across a new term, R-squared. I don't know how it works; all I know is that it's a value that usually falls between 0 and 1 (it can even go negative when the fit is really bad), the higher the better, and it doesn't depend on how much training data we've got. Knowing that this course won't teach me any more of the theory and working principles that universities usually teach, I finally took a detour and watched the course built by Georgia Tech. The first lesson, SL1, is about classification and decision trees. I think I already know enough, thanks to my notes and Thrun and Malone, so I watched it at 2x speed. I think Malone taught Decision Trees better, especially for people who are not familiar with the concepts of Information Gain and Entropy (I learnt them from some notes from UW last semester, but I've already lost them, sorry). Then I moved on to SL2, which taught me more about Regression (finally!). And again, thanks to Thrun and Malone, I caught up really fast using the basic concepts I had learnt the same day. What I learnt was exactly how polynomial regression is done, pretty cool!
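For anyone else wondering where R-squared comes from, here is a small sketch under some assumptions: the data is synthetic (a noisy straight line made up for this example), and the model is scikit-learn's LinearRegression. R-squared is computed by hand as 1 minus the ratio of the residual sum of squares to the total sum of squares, then compared with what sklearn reports.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data: a straight line plus Gaussian noise (made up for illustration)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=100)
X = x.reshape(-1, 1)  # sklearn expects a 2-D feature array

reg = LinearRegression().fit(X, y)
y_pred = reg.predict(X)

# R^2 = 1 - (sum of squared residuals) / (total sum of squares around the mean)
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(f"manual R^2:    {r2_manual:.4f}")
print(f"r2_score:      {r2_score(y, y_pred):.4f}")
print(f"reg.score:     {reg.score(X, y):.4f}")
```

Because both sums grow with the number of points, their ratio stays comparable across dataset sizes, which is the sense in which R-squared does not depend on how much data there is.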
Feeling like I had already learnt a great deal, I decided to take a rest and go back to my Statistics study. Did I tell you I'm so rusty in Statistics that I'm surviving on the stuff I still remember from high school?
February 25, 2016 (Thur)
Finished what's left of Lesson 6 and finished Lesson 7. Finished Arrow Season 3.
I spent the whole day at ASM for a second interview and annoying my colleagues. Then I went back to CUHK to attend my FYP meeting (and finished the season finale of Arrow Season 3 on the MTR), so there wasn't really much I could get done today. I finished the mini project in Lesson 6, which was simple yet awesome. Then I was torn between starting Lesson 7 and starting to work on some MongoDB stuff (for my FYP). After five minutes of debate inside my head, I said screw it and went for Lesson 7 (clearly I wasn't in the mood for database stuff).
Turned out Lesson 7 was not that hard. In fact, it was just another bunch of Python exercises that help you understand the effect of outliers along the way. It was really fun, and understanding outliers is not that hard (for now, I haven't taken a look at the textbooks yet).
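To make the effect of outliers concrete, here is a sketch of one common cleaning strategy: fit a regression, drop the points with the largest residuals, and fit again. Everything specific here is an assumption for illustration (the synthetic data, the 10% cut-off, the variable names), not the lesson's actual code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic line y = 2x + 1 with small noise, plus 10 deliberately corrupted points
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=100)
y[::10] += 30.0  # every 10th point becomes an outlier
X = x.reshape(-1, 1)

# Step 1: fit on everything, outliers included
first_fit = LinearRegression().fit(X, y)

# Step 2: rank points by absolute residual and keep the best-fitting 90%
residuals = np.abs(y - first_fit.predict(X))
n_keep = len(x) - len(x) // 10
keep = np.argsort(residuals)[:n_keep]

# Step 3: refit on the cleaned data
second_fit = LinearRegression().fit(X[keep], y[keep])

print(f"before cleaning: slope={first_fit.coef_[0]:.3f}, intercept={first_fit.intercept_:.3f}")
print(f"after cleaning:  slope={second_fit.coef_[0]:.3f}, intercept={second_fit.intercept_:.3f}")
```

The 10% figure is a judgment call: cut too little and outliers survive, cut too much and good data gets thrown away.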
On the other hand, I went back to the lecture notes from my university yesterday and found that the prof had spent four slides deriving linear least squares. If that sounds a little confusing, don't worry, I'm gonna explain. After finishing SL2 yesterday, I wanted to learn more about how to minimize the error in polynomial regression. So I flipped through the textbook and lecture notes I've got, but I came across linear regression instead. Well, it wouldn't hurt to take a look.
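For reference, the standard derivation of linear least squares for a straight line can be sketched as follows. The symbols (slope a, intercept b, data points (x_i, y_i), means written with bars) are assumptions chosen for this sketch and may not match the notation in the lecture notes.

```latex
% Sum of squared errors for the line y = a x + b
E(a, b) = \sum_{i=1}^{n} \left( y_i - a x_i - b \right)^2

% Setting the derivative with respect to b to zero
\frac{\partial E}{\partial b} = -2 \sum_{i=1}^{n} \left( y_i - a x_i - b \right) = 0
\quad\Longrightarrow\quad
b = \bar{y} - a \bar{x}

% Setting the derivative with respect to a to zero and substituting b
\frac{\partial E}{\partial a} = -2 \sum_{i=1}^{n} x_i \left( y_i - a x_i - b \right) = 0
\quad\Longrightarrow\quad
a = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
```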
March 29, 2016 Update:
You can check out my code here:
Lesson 6 - Regressions:
https://github.com/kevguy/Intro-to-Machine-Learning-ud120/tree/Stage06_03_Sneak_Peek
Lesson 7 - Outliers:
https://github.com/kevguy/Intro-to-Machine-Learning-ud120/tree/Stage07_05_Remove_A_Outlier
February 26, 2016 (Fri)
Went back to the lecture notes of my course.
I decided to head back to the books today to see what I had missed. Parametric methods? Shit, never heard of that. The first thing I came across was Maximum Likelihood Estimation, never heard of that either. I still couldn't understand what the book was talking about even after reading it five times, and then, once again, Udacity came to the rescue. Turned out Maximum Likelihood Estimation is a really simple concept, I was just razzle-dazzled by all those formal words (mathematicians... right?). Just watch this video and you'll know what I mean:
After deriving the maximum likelihood estimates (estimators? estimates? whatever) for Bernoulli and Gaussian random variables, I called it a night and went to bed. For the details, you can refer to the examples in this link. I was totally lost when my prof was explaining linear regression, and even more confused when I tried to read the textbook, but thanks to this course I finally understand how it works, and I derived the least squares error formulae for a polynomial of degree 1.
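The maximum likelihood estimates mentioned above work out roughly as follows. This is the textbook-standard result written in assumed notation (n independent samples x_1, ..., x_n, log-likelihood written as \ell), not a copy of the linked examples.

```latex
% Bernoulli with parameter p, samples x_i in {0, 1}
\ell(p) = \sum_{i=1}^{n} x_i \log p + \Bigl( n - \sum_{i=1}^{n} x_i \Bigr) \log (1 - p)

\frac{d\ell}{dp} = \frac{\sum_{i=1}^{n} x_i}{p} - \frac{n - \sum_{i=1}^{n} x_i}{1 - p} = 0
\quad\Longrightarrow\quad
\hat{p} = \frac{1}{n} \sum_{i=1}^{n} x_i

% Gaussian with mean \mu and variance \sigma^2
\ell(\mu, \sigma^2) = -\frac{n}{2} \log \left( 2 \pi \sigma^2 \right)
  - \frac{1}{2 \sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2

\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i ,
\qquad
\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2
```

In both cases the recipe is the same: write down the log-likelihood of the data, differentiate with respect to the parameter, and set the derivative to zero.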
February 27, 2016 (Sat)
Started Lesson 8. Started binge-watching The 100. Man do I love TV!
I woke up at five today. I was really tired yesterday so I went to bed early. I was gonna go to Master's home in the afternoon to hang out and let him teach me some SQL stuff too, so I had like six hours to kill. Not feeling like Machine Learning, I checked out the Intro to HTML and CSS course first to refresh myself a bit. That was really fun and interesting, you should really check it out. Then I started Lesson 8.
Clustering sounded like a simple idea thanks to Thrun and Malone. However, they didn't explain how to iterate the steps to make the results better; they only showed the results in a short video. Well, I'm already at Lesson 8, so I know the deal. I was pretty sure the mini project would be fun and easy as usual, but I wanted to know more about the principles before moving on, so I think I'll do it next week.
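As a rough idea of what "iterating the steps" can look like, here is a sketch that assumes the clustering algorithm in question is k-means. The function name, the random initialization and the two synthetic blobs are all choices made for this illustration. Each round has two steps: assign every point to its nearest centroid, then move each centroid to the mean of the points assigned to it, and repeat until the centroids stop moving.

```python
import numpy as np


def k_means(points, k, n_iter=100, seed=0):
    """Plain k-means: alternate assignment and update steps until convergence."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points picked at random
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label each point with the index of its nearest centroid
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its old centroid)
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


# Two well-separated synthetic blobs (made up for illustration)
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=(5.0, 5.0), scale=0.5, size=(50, 2))
points = np.vstack([blob_a, blob_b])

centroids, labels = k_means(points, k=2)
print(centroids)
```

The result depends on the starting centroids, which is why library implementations usually rerun the algorithm from several initializations.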
Then I checked my texts and Janice told me the first Machine Learning programming assignment (in my university's course) had been released. I finished half of it in an hour. I could only do that because Malone and Thrun did such a good job designing the mini projects and letting me practice. Hope I can share my work with you soon! I stopped at the part where I have to do polynomial regression, which was only covered a little in Littman's course. Littman proved the formula informally in this video:
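Assuming the formula meant here is the usual normal equation for least squares, w = (X^T X)^(-1) X^T y, it can be tried out in a few lines of NumPy. The function name and the test polynomial below are made up for this sketch; X is the matrix whose columns are 1, x, x^2, and so on up to the chosen degree.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit via the normal equation (X^T X) w = X^T y."""
    # Design matrix with columns 1, x, x^2, ..., x^degree
    X = np.vander(x, degree + 1, increasing=True)
    # Solve the linear system instead of inverting X^T X explicitly
    return np.linalg.solve(X.T @ X, X.T @ y)


# Noise-free samples of a known quadratic: y = 1 - 2x + 0.5x^2
x = np.linspace(-3, 3, 50)
y = 1.0 - 2.0 * x + 0.5 * x**2

w = fit_polynomial(x, y, degree=2)
print(w)  # coefficients in increasing order of power
```

For high degrees or badly scaled inputs, X^T X gets ill-conditioned, so np.polyfit or np.linalg.lstsq tends to be the safer tool in practice.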
But the textbook took quite a different approach and derived the formula in a more formal way. I tried to do it myself, but I don't understand some of the assumptions and the logic. Feeling helpless, I left home for Master's and spent the rest of the day working on my FYP (things have started to get tense between Angus and me again, and I have to do more work to show the prof I'm trying my best) and being tempted by Master to play video games...
Kev
Reviewed by Kevin Lai