Friday 6 December 2013

CSC 148 SLOG for Week 12

         This week we learned how to make methods more efficient by avoiding redundant calls. I am pretty familiar with the fibonacci sequence but I have never heard about memoization, so it is a little hard to understand. However, even though the concept is completely new the algorithm is pretty clear in that it just checks whether a given calculation is already included in the cache and if so there is no need to run it. However, defining a function within a function sounds crazy for someone who came from a java background like me. Nevertheless, the concept is a little similar to a nested for-loop so after some careful contemplation the algorithm becomes pretty straightforward. We also learned how to make quicksort much more efficient without any worst case by randomizing the list that is going to be sorted which is pretty reasonable to me I already know that quicksort would perform really badly if the list already sorted. I found the lab this week very interesting in that it relates a graphical concept to a list and referring to each element needs some mathematical knowledge. It was a little challenging at first when I couldn't understand the structure of the tree, but then it became much easier since I understood the math behind it.
         Selection sort is a sorting algorithm that finds the minimum or the maximum element of a list and puts it at the front or the back and excluding it for the next recursion, and it always has an order of O(n^2). I know selection sort pretty well from my java class in high school and it has now become so much easier since we could just use the built-in max function in python.
         Quick sort is a sorting algorithm that finds a pivot value and one value that is bigger than it on the left and one that is smaller on the right and swaps and keeps running the recursion until the list is sorted. It has a best order of O(nlogn) but a worst order of O(n^2) since if the list is sorted it would need to go through the whole list n times, making it n^2. This inefficiency is why we should randomize the list at first which greatly reduces the chance of sorted parts of the list so that the order would approach the best case.
         Merge sort is a sorting algorithm that divides the list into elements, compares them and puts them in the right place. It has always been the most complicated sorting algorithm in my opinion since you have to use both recursion and for loops and check for many cases. The code I learned in high school was so long that the teacher didn't really want to talk about it. It has a constant order of O(nlogn) and is considered the most sophisticated and optimal algorithm for any type of lists.

         I agree with the author of http://csc148.blogspot.ca/ that Professor Heap's lectures are very helpful and I particularly like the challenging yet very intriguing assignments he gave. However, I don't think the tests were very long. As long you could figure out how to construct the recursive algorithms the allotted time was reasonable. The tests were pretty normal in my opinion. The questions were not particularly hard, but they do test a considerable amount of knowledge of the course material.

Sunday 24 November 2013

CSC 148 SLOG for Week 11

         This week we finally learned the python convention of information hiding, even though it is still not strongly enforced since it is only a "convention", unlike in Java the private keyword will ban the users from accessing certain data. I was shocked the first day of class by how casual and frivolous python's format and rules are. There are no braces, no variable definition. Everything is dynamic and defined at run-time. However, I gradually got used to it and the convenience it brings. I feel like python is way more user-friendly with its less rigid rules and more comprehensive library and it feels great to just code for fun. But still I heard that in the later years we are going to focus on java to make some serious programs with copyright. Each language has its merits, and I like all of them. Python trusts the users to be professional programmers who wouldn't be meddling around with unauthorized data, so it regulates legitimate programming to the specific field, which is suitable for our use since we are all programmers-to-be that would follow the rules. We also learned about the __eq__ and the __mul__ function and how you could override it to your own implementation, which is similar to the str and the __repr__ functions. I think I've learned a similar method to __eq__ in java but I can't remember it now. I do remember the toString() method in java which works the same way as str in python. We also learned about the reduce function which is something I've never heard before but is yet another one of those convenient and logically clear functions introduced in python. I've never thought of multiplication as reducing several numbers to a single number. That logic is intriguing to me. There is no homework this week. The lab this week is not really related to coding it just asks us to graph the efficiencies of all the sorting algorithms and do the same if the list is already sorted or the reverse of a sorted one. The cprofile was really useful in that it maps out the process of the algorithm and number of calls on the way so that in the future we can improve our code using this and fixing the areas where there are too many unnecessary calls. I could answer all of the questions posed in the lab such as why mergesort and quicksort would be faster if you directly change the list parameter. It is because the program does not need to spend time copying the list into another one. There was one result that did bother me. It was that when I tested the reverse case bubblesort being a worst case of n^2 the same as selectionsort, took 10 seconds while selection took around 6 I think. I was very confused of why this would happen. I thought bubblesort was a pretty efficient algorithm. Then I thought for a while and remembered that the reason why mergesort and quicksort are slowing if they need to copy is because of the number of calls, while the big-O is obviously still nlogn no matter you copy it or not. So this time tells you the time spent on calls of all kinds of methods, while the number of iterations needed for sorting is still the same. Therefore I thought to myself why bubblesort has too many calls. Then I realized it is because it swaps every single time (which is a call) it finds next element smaller than the element in question while selectionsort finds the max which is one call if using built-in python one and simply puts it at the back which is another. Even though the finding max part could use a lot of calls such as assigning a different value it is definitely more efficient than swapping every single time with changing two variables. In the end their big-Os are still n^2 but bubblesort takes longer.

         I agree with the author of http://cf148.blogspot.ca/ that the big-O of an algorithm is very hard to determine since it ties in the knowledge of limits which we learn in calculus. For example selection sort should have an order of n(n-1)/2 but in reality it is n^2 since when n gets large n-1 is n and the division by 2 doesn't make much of a difference in the face of infinity so we can ignore it. The other big-Os are even more related to intuition so it is pretty confusing at times.

Thursday 21 November 2013

CSC 148 SLOG for Week 10

         This week we didn't learn anything since we had a holiday on Monday and the second part of the midterm on Wednesday. And there was also no lab this Friday. There was also no exercise nor assignment. The midterm was pretty easy I think. I was pretty confident with binary trees and binary search trees and recursion so it was a piece of cake. I find it way easier than the first test, which required some more knowledge of built-in functions which I am not proficient with.

         I had the same experience as the author of http://tgslog.blogspot.ca/ about the assignment. When I first wanted initialize a RegexTree I tried to build it from the bottom up since it appeared to me as more natural and easier to code. However, I realized that from the top down would be much more optimal since we could count the number of parenthesis and we could always use the number of parentheses to check which symbol is the root and also recursion only works well from the largest scale to the smallest.

Monday 11 November 2013

CSC 148 SLOG for Week 9

         This week we learned about sorting algorithms and their efficiencies. I only learned selection sort, merge sort and quick sort in high school, and we didn't touch on the fancier ones such as bubble sort and Tim sort. I think that sorting is very algorithm-oriented, therefore as long as you can follow the recursive steps you should be able to understand sorting pretty easily. However it bedazzles me how the inventors of these sorting algorithms came up with it since it is so clear and straight-forward when you read it but it's a completely different story when you want to devise one yourself. They also have to consider about efficiency and when is the optimal time to use their sorting method. I also feel the same way when I am studying mathematics. The proofs of the theorems or the ways to solve a problem all appear pretty easy but when you are the first one to come up with it they would become so difficult. There is no exercise this week. I finished the assignment the week before so I just had to submit it. However my teammate did find an error with the code so I went back and analyzed it. It was a really stupid mistake. I had use the length minus the iteration variable for checking the divisibles instead of length minus one. The lab this week was also pretty straight-forward in that it tests our knowledge about linked lists.

         I agree with the author of http://csc148courseblog.blogspot.ca/ in that stars are very hard to handle in assignment two since they do not have parentheses accompanied with therefore making the checking procedure more complicated in part one. However I found a good way around it which is that you check whether at that symbol "*" the number of parentheses is zero and I checked from the back since stars are always the last character of a regular expression and it saves some checking time this way. The star would become more of a nuisance in part two where I didn't know how to partition the string for checking since star would make the number of characters indefinite. In the end I just checked for every single group of characters divisible by the length of the string on the left of the star.

Sunday 3 November 2013

CSC 148 SLOG for Week 8

         This week we learned about binary tree lists that are arranged in the order of the left subtree being smaller than the root and the right subtree being larger than the root. We looked at how the insert, find, delete and find max methods work using recursion, and in the exercise we are supposed to implement a contains and a max node method using iteration following a model of the insert method in iteration. I find material we learned in class this week pretty easy since we've already learned binary tree extensively. We also learned about searching and sorting efficiency, which I still feel a little unclear about. I learned about big-O notation and efficiency in high school but we didn't go into too much detail of it, so I read the lecture slides carefully and listened carefully in class and now I think I have a good understanding about the notation. I think it is a bit similar to limits, which I learned in calculus. When n get very large, the efficiency approaches n or n^2 or log(n) or nlog(n) which when I drew a graph of the efficiency versus n of the first few points seem pretty reasonable to assume if n approaches infinity. This concept once again confirms the fact that math and computer science are very related and I could use some math concepts such as graphing to help me solve hardships in computer science. The exercise this week was enough since I only needed to follow the model of insert to implement the methods contains and max node. I spent a long time on the lab this week but it was solely because I misunderstood the question. After rereading it I just thought for a while and knew how to do it. I finished the assignment this week and find this assignment just as interesting as the last assignment. Even though this one is not as complicated and long as the last one, it is pretty challenging to understand the concept of regular expression and comparing it with string. Unlike the professor, I actually found the first part quite easy since I only needed to turn a string into a tree. I used the method of checking the number of parenthesis and working my way from the top down, finding the symbol for the root every time and recursively initialize regextrees as the left and right children. However I did encounter a lot of trouble on the second part since I was unsure whether I should check from the top down or the bottom up. The major source of difficulty was that I didn't know where to partition the string to check for for example the left child of a dot root. In the end I used a for loop to check for every single partition and every single number of elements. Then everything flowed smoothly and I implemented the bar, which is very similar to the dot and the star.

         I agree with amber from http://amberlikescode.blogspot.ca/ in that binary search trees and binary trees are super useful and even though they are not built-in data types they become natural after some practice. My roommate told me that microsoft actually asked him about binary trees during an interview.

Sunday 27 October 2013

CSC 148 SLOG for Week 7

         This week we continued to learn about binary trees and nodes and how to implement a tree using a linked list. I have studied linked list before and it was pretty much the same in java as in python. I find linked list very convenient because we only need one instance variable to represent the whole list. The exercise this week was a little painful for me since I kept misunderstanding the instructions and I also forgot to check what internals and leaves referred to, so that I always got errors reported from markus. It was a pretty fun exercise, though it did waste me a long time correcting my stupid mistakes, and it wasn't as hard as last week's which I really enjoyed. However, I did start doing the assignment this week and find it very intriguing. I always like case studies and classes that are not built in such as the grid world case study from eclipse. I think this way it would compel us to think more comprehensively and diligently since there are no convenient library methods ready for use. And this assignment is another typical example of a logical problem that does not require much code. I finished the first part but I'm still stuck on the second because I just can't figure out how to check the RegexTree from the bottom up and how to reference to the corresponding characters in the string. I already drew a detailed tree and analyzed different scenarios but I still can't solve it. I want to leave it for now and maybe ask the professor next week. I feel pretty confident about the material covered this week because I think I have a pretty good logic but not a very good memory, so binary tree is fairly easy to me. The lab this week was also about linked lists which I found no difficulty in. This week I also received my grade for the first assignment and I received a 70 something with a class average of 60 something. I have to say that the grading was pretty harsh since it was already very hard for me to finish everything and make it work while looking at the rubric I saw that you would only get half the credit if you just finish it. I find that unreasonable since not everybody can think in a concise and efficient way now that we are just beginning to program and as long as I was not hardcoding I should get a higher grade. But I did enjoy the assignment a lot and this rubric prompts me to be more efficient in writing code. I recall the exercise last week where I used the maximum of a list of tuples in a list comprehension and feel like I am able to write efficient code already. It's just that large assignments are more daunting so that I can't really think of brevity. I need to pratice and think more about which algorithm is the most efficient. I read the blog by bearw on http://obiwanslog.blogspot.ca/ and felt the same way as he did. I, too, spent a long time debugging because there was just too much to implement but when I got to tour of Anne Hoy and SolvingController it flowed really quickly because I was immersed in the fun of logical thinking. The last part required a lot of thinking and little coding.

Sunday 20 October 2013

CSC 148 SLOG for Week 6

         This week we didn't learn about anything since monday was thanksgiving and we had midterm on Wednesday. The midterm was okay in terms of difficulty but the sad thing is since I skipped CSC108 I am not familiar with how some of the functions run. For example, I got the first question completely wrong because I thought the minimum of a list of tuples will give you the tuple with the least element but after the midterm I realized that it gives you the tuple with the least first element and if the first is the same for some it looks at the second. I need to work harder with familiarizing myself with python, though I do feel more confident now after several weeks of practice. I should read documentations more often. Moreover, I find the exercise this week incredibly fun. This is the hardest exercise so far and I really enjoyed doing it. Before the exercise I already knew quite well the difference between the three traversals, however I never thought building the original tree from the traversals would be way harder from the other way around, since you don't have a clear sense of division between the root and the left subtree and the right subtree now that everything is in one list. But with two traversals you can find the root from one then the left and right from the other and keep using recursion to order everything. I also remembered to handle the exceptional cases where the left or right subtree is none. I felt a great sense of accomplishment after I finished it, and part b is just a piece of cake compared to a. This whole concept of trees make me reflect on some of the most interesting topics I learned in java. I find trees very different from what we've learned so far since it is not actually a built-in class. It is a useful concept defined by convention. When I studied java before I felt like searching and sorting was very interesting since it was pure logic and there was no existing resource at all, just like trees in python.