Sunday, November 6, 2011

Big O notation

Classification of an algorithm's running time, given input of size n, as n grows very large.

Smallest to Largest:
O(1) Constant
O(log n) Logarithmic
O(n^c) Fractional power, 0< c < 1
O(n) Linear
O(n log n) Linearithmic (equivalently O(log n!))
O(n^2) Quadratic
O(n^x) Polynomial, x > 2
O(c^n) Exponential, c > 1
O(n!) Factorial


Logarithmic Running Time, an explanation

When computing the running time for some set of operations, how do you know whether something is log base 3, 2, or 10? I will refer to log base x as log_x from here on out.

A set of operations is said to take log_x n steps if you're decreasing the amount of work by a multiplicative factor at each step. Take merge sort for example: we divide the list by 2 at each level, so that takes log_2 n steps (n being the size of the list). So if you're dividing the work by a constant factor each time, you end up with a logarithmic running time.

Also, all logs can be reduced to log_2. Why and how, you might be wondering, since you've probably noticed the running times of algorithms are almost always written with log_2.

That is because you can convert between any of the logs. They are all the same within a constant factor.

Equation:
log_b (n) = log_x (n) / log_x (b)

Example with converting from base 2 to 10:
log_2 (n) = log_10 (n) / log_10 (2)

log_2 (n) = log_10 (n) / constant
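
As a small sanity check, here is a minimal Java snippet (my own illustration, not tied to any particular algorithm) that computes log_2 n via the change-of-base formula, dividing by that constant:

public class LogBase {
    public static void main(String[] args) {
        int n = 128;
        // log_2(n) = log_10(n) / log_10(2) -- dividing by a constant
        double log2n = Math.log10(n) / Math.log10(2);
        System.out.println("log_2(" + n + ") = " + log2n); // prints 7.0
    }
}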


Real example of why a function's running time is logarithmic:
Take a list with 128 elements, apply merge sort to it. With merge sort, you keep dividing by 2:
128
64 64
32 32 32 32
16 16 16 16 16 16 16 16
..
2...2

At each level you process all n numbers, and there are 7 levels of splitting/merging before you reach the final answer; that's where log_2 128 = 7 comes from.

What the algorithm looks like if mapped on an x, y axis:
The logarithmic function measures the rate of growth. It is the inverse of the exponential function. As a result, it is a slow growth function (opposite of an exponential function).

An exponential function's graph starts out nearly flat and then shoots up steeply; we care only about the +x, +y quadrant. Since the logarithm function is its inverse, its graph is that curve reflected across the line y = x: it passes through (1, 0), with part of the curve in the +x, -y quadrant (for inputs less than 1) and the rest in the +x, +y quadrant, flattening out as x grows. You can tell that for very large input, growth will be slow.

Inverse relationship example:
log_2 128 = 7
2 ^7 = 128

Properties of Sorting Algorithms

Run time complexity:

Amount of steps/number of operations required to complete the algorithm.

Performance:

How fast can the algorithm run? This is a bit different than running time complexity, because here you care about how long each operation takes. For example, suppose you need to look up something in a DB to complete your sorting, say because the list only has ids but you want it sorted by name as well - looking up those names is expensive and takes time.

//how long it would take to carry out 10 operations of varying lengths:
10 operations of x (each x takes 10 ms) = 10 x 10 ms = 100 ms
10 operations of y (each y takes 100 ms) = 10 x 100 ms = 1000 ms

Another topic that falls under performance is cache locality. Whether you're frequently accessing nearby items while you're processing a list matters, especially when the list is large and does not fit into memory. If you're constantly thrashing and moving things in and out of memory, you have bad cache locality. If you work on one subset of the list, and then move on and work on the next subset without constantly thrashing (for example, without constantly traversing the entire list), you have good cache locality.

Good cache locality: quick sort
Bad cache locality: heap sort, merge sort (if not done in place)

Storage (in-place vs not):

Extra storage needed to hold things while sorting a list.

An algorithm is in-place if it does not need extra storage space while sorting the list (minus a few variables). For example, while the algorithm is sorting, it does not need to create a new list to put the sorted elements into, it can do the sorting and placing of elements in their proper place directly on the existing list.

Stable vs Non-Stable:

An algorithm is stable if it does not change the relative position of items that are equal.
So after sorting a list, the relative position of items that are equal is guaranteed to stay the same.

Sort the list below by GPA (high to low). Input list:
[(Anna, 4.0), (David, 4.0), (John, 2.0), (Kelly, 3.0)]

Stable should produce:
[(Anna, 4.0), (David, 4.0), (Kelly, 3.0), (John, 2.0)]

The relative position of Anna and David is maintained; Anna still comes before David.

It should not produce (while non-stable algorithms could produce this):
[(David, 4.0), (Anna, 4.0), (Kelly, 3.0), (John, 2.0)]

The sorting is correct, but the relative position of David and Anna is not the original.

You'd choose a stable sort if you care about the current order of the list you're about to sort (because it has some property you want to preserve).

So like the example above: if I have a list sorted alphabetically, and now I want to sort it by GPA while keeping the alphabetical order among equal GPAs, I'd want to choose a stable algorithm.

Why not just sort and take the various properties into account directly? Who cares which algorithm you use? Because it might be expensive to look up the data needed for the exact ordering you want. Say you have a list with only ids and GPAs. You want that list sorted by GPA with ties broken alphabetically by name, even though names aren't included. If the list is pre-sorted alphabetically, you want to retain that ordering, because if you don't, you'd have to do a DB lookup (which is slow) to get each name and factor it into your comparison when doing the sort.
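
Here is a minimal sketch of the GPA example above in Java (the Student class and field names are my own invention for illustration). Collections.sort is documented to be stable, so after sorting the alphabetically ordered list by GPA, Anna still comes before David:

import java.util.*;

class Student {
    String name; double gpa;
    Student(String name, double gpa) { this.name = name; this.gpa = gpa; }
    public String toString() { return "(" + name + ", " + gpa + ")"; }
}

public class StableSortDemo {
    public static void main(String[] args) {
        List<Student> list = new ArrayList<Student>(Arrays.asList(
            new Student("Anna", 4.0), new Student("David", 4.0),
            new Student("John", 2.0), new Student("Kelly", 3.0)));

        // Sort by GPA, high to low. Collections.sort is stable, so students
        // with equal GPAs keep their original (alphabetical) relative order.
        Collections.sort(list, new Comparator<Student>() {
            public int compare(Student a, Student b) {
                return Double.compare(b.gpa, a.gpa);
            }
        });

        System.out.println(list); // [(Anna, 4.0), (David, 4.0), (Kelly, 3.0), (John, 2.0)]
    }
}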

Saturday, November 5, 2011

ClOsUrEs!

Tell me about closures, mentioning something other than their cool sounding name!

It is when a function accesses non-local variables (variables outside of its own scope) and the function gets used at some later point in time, potentially outside the scope those variables were defined in. It is as if you're wrapping the function code and the references to the outside variables in a sack for later use.

The typical scenario is when an embedded function (a function defined within another function) accesses its parent function's variables (aka lexical closure). Say that embedded function is then returned to be used later on. That is a closure: the function is associated with state bound to the variables outside of its scope. What gets created, and can be used later in the code, is the function code plus references to the outside variables it accesses.

I guess closures got their name because the function captures not just its input but the environment around it (the non-local variables), as if it's surrounding it and closing over it for later use.

Some languages have this built in, others don't.

Some languages require the non-local variables to be in the parent's scope (lexical closure), others don't.

//Closure example, pseudo code
function A() {
    count = 0;
    function B() {
        count++;
        print count;
    }
    return B;
}

A closure is created that consists of B's code and a reference to the variable count. You can capture the closure for later use. Since a closure will be created from the code above (count and B()), that closure will not be on the stack because it will have to live outside of the lifetime of the function that is creating the closure (function A).

This is different from static variables in Java/C++ and global variables in C++, because here you're not reusing the same count variable every time: each call to A creates a new count, and the closure returned by that call gets its own copy.

Closures are usually stored on the heap. You could do an optimization that if a closure's lifetime is within the function that created it, you could keep it on the stack. But in general that's not the case.

Languages such as Java (as of this writing) don't have closures built in. In these languages, you'd have to pass the outside variables into the functions that want to use them. Actually, in Java you can't just have embedded functions like that; you'd need an object that invokes them. How to implement closures in Java is another post altogether.
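
That said, here is a rough pre-Java-8 workaround (my own sketch; the Counter interface and names are made up): an anonymous inner class that captures a final variable, with a one-element array used to smuggle in mutable state:

interface Counter {
    int next();
}

public class ClosureDemo {
    static Counter makeCounter() {
        final int[] count = {0}; // captured "non-local" state
        return new Counter() {
            public int next() {
                count[0]++;
                return count[0];
            }
        };
    }

    public static void main(String[] args) {
        Counter a = makeCounter();
        Counter b = makeCounter();
        System.out.println(a.next()); // 1
        System.out.println(a.next()); // 2
        System.out.println(b.next()); // 1 -- each call to makeCounter gets its own count
    }
}

Calling makeCounter() twice gives two independent counters, which is the same point made about count above.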

Event-Driven Programming

What is event-driven programming?

A programming concept where the program flow is driven by events: an event or action takes place, and a specific response is generated based on that event.

Many GUIs are event-driven and have a main UI loop that runs and receives events, such as mouse clicks, and then processes them.

Another example is interrupts (a signal to a component indicating that a change happened or a response is needed), such as the hardware interrupt generated when you press a computer's power button.

Asynchronous APIs and function callbacks could be used to implement event-driven programming.
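
A tiny sketch of the callback idea in Java (the ClickListener interface and event names are invented for illustration): handlers are registered up front, and a loop dispatches incoming events to them:

import java.util.ArrayList;
import java.util.List;

interface ClickListener {
    void onClick(String button);
}

public class EventLoopDemo {
    private final List<ClickListener> listeners = new ArrayList<ClickListener>();

    void register(ClickListener l) { listeners.add(l); }

    // Dispatch one event to every registered callback.
    void fireClick(String button) {
        for (ClickListener l : listeners) {
            l.onClick(button);
        }
    }

    public static void main(String[] args) {
        EventLoopDemo loop = new EventLoopDemo();
        loop.register(new ClickListener() {
            public void onClick(String button) {
                System.out.println("handled click on " + button);
            }
        });

        // Stand-in for a real UI loop that would block waiting for input.
        for (String event : new String[] {"OK", "Cancel"}) {
            loop.fireClick(event);
        }
    }
}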

Wednesday, November 2, 2011

Iterator Pattern

A design pattern that uses an iterator to traverse the elements of some collection without exposing its underlying implementation.

Examples:
Iterators can be used to traverse linked lists, trees, maps, etc. Same interface, different underlying implementation.

Interface for Iterator/Main methods on Iterator:
hasNext()
next()
remove()
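
A quick example in Java: the same hasNext()/next()/remove() calls work regardless of what collection produced the iterator (here an ArrayList, but it could just as well be a LinkedList or a map's key set):

import java.util.*;

public class IteratorDemo {
    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4, 5));

        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            int n = it.next();
            if (n % 2 == 0) {
                it.remove(); // safe removal during traversal
            }
        }
        System.out.println(numbers); // [1, 3, 5]
    }
}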

Sudo vs Su

su: a command line tool that lets you switch to another user account. The name is usually read as "substitute user" or "switch user".

sudo: a command line tool that lets you run a single command as the superuser/root (or, if configured, as another user).

Gang of Four

Interviewer: Who are the gang of four?
Interviewee: A music group that plays really awesome songs!
Interviewer: FAIL
Interviewee:

Don't let the above happen to you : )

The "gang of four" are 4 authors that wrote a famous design pattern book called "Design Patterns: Elements of Reusable Object-Oriented Software"

Wednesday, October 26, 2011

Mutex vs Semaphore

Mutex: Mutual Exclusion Lock. Allows only one thread at a time to enter the protected area/critical section. Once a thread has obtained the lock, other threads wait until that thread exits the critical section and releases the lock. The thread owns the lock, and only that thread can release it; only one thread can own the lock at any given point. It is a locking mechanism used to control access to a resource.

Mutexes help prevent race conditions. The example usually given here is two threads depositing money into the same bank account.

A mutex is essentially a semaphore with value 1, except it differs in how threads "own" the lock.

Semaphore: Allows up to N threads to enter the protected area. Threads decrement the semaphore when they enter the protected area and increment it when they leave, so multiple threads can hold it at the same time. It is more of a signaling mechanism that keeps a count to control access to the shared resources.

Producer/Consumer example.
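
A rough sketch of the producer/consumer idea using java.util.concurrent.Semaphore (the buffer size and item counts are arbitrary choices for illustration): one semaphore counts empty slots, one counts filled slots, and a binary semaphore acts as the mutex around the buffer:

import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.Semaphore;

public class ProducerConsumer {
    static final Queue<Integer> buffer = new LinkedList<Integer>();
    static final Semaphore empty = new Semaphore(5);  // 5 free slots
    static final Semaphore filled = new Semaphore(0); // 0 items to start
    static final Semaphore mutex = new Semaphore(1);  // binary semaphore used as a lock

    public static void main(String[] args) {
        new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 0; i < 10; i++) {
                        empty.acquire();          // wait for a free slot
                        mutex.acquire();
                        buffer.add(i);
                        mutex.release();
                        filled.release();         // signal: one more item available
                    }
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            }
        }).start();

        new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 0; i < 10; i++) {
                        filled.acquire();         // wait for an item
                        mutex.acquire();
                        System.out.println("consumed " + buffer.remove());
                        mutex.release();
                        empty.release();          // signal: one more free slot
                    }
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            }
        }).start();
    }
}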


Object in Java

In Java, Object is the root class. Every class inherits from it, so all objects have the methods of this class.

clone(), creates and returns a copy of the object
equals(Object), tests whether the two objects are equal to each other.
finalize(), garbage collector calls this method when there are no more references to this object
getClass(), returns the runtime class of the Object
hashCode(), hash code value of the Object
toString(), string representation of the Object
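
A minimal sketch of overriding a few of these (the Point class is made up for illustration); equals and hashCode should always be overridden together so that equal objects hash the same:

public class Point {
    private final int x, y;

    Point(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y; // equal points must produce equal hash codes
    }

    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }

    public static void main(String[] args) {
        System.out.println(new Point(1, 2).equals(new Point(1, 2))); // true
        System.out.println(new Point(1, 2)); // (1, 2)
    }
}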

Thread related functions:
notify(), notify/awaken a single thread waiting on this object
notifyAll(), notify/awaken all threads waiting on this object
wait(), causes the current thread to wait until another thread invokes notify
wait(timeout), causes the current thread to wait until another thread invokes notify or the timeout is up

Queue, Deque, Stack

Stack (LIFO, Last In First Out)
In Java, the Stack class extends Vector.

Deque (aka Double-Ended Queue): objects can be added to or removed from either the front or the back. A doubly linked list is a good implementation structure for this.
In Java, the Deque interface extends Queue.

Queue (FIFO, First In First Out)
In Java, the Queue interface extends Collection.

LinkedList is a good data structure to implement all of these.

Methods on these classes
push(Object)
pop()
clear()
isEmpty()
size()
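
A short example, assuming ArrayDeque (a LinkedList would work the same way): the same deque type can act as a stack with push/pop or as a queue with offer/poll:

import java.util.ArrayDeque;
import java.util.Deque;

public class DequeDemo {
    public static void main(String[] args) {
        Deque<String> stack = new ArrayDeque<String>();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop()); // b -- last in, first out

        Deque<String> queue = new ArrayDeque<String>();
        queue.offer("a");
        queue.offer("b");
        System.out.println(queue.poll()); // a -- first in, first out
    }
}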

Thursday, October 20, 2011

List in Java

LinkedList, ArrayList, Vector, and Stack all implement this interface.

Important methods on this interface:

add(Object)
add(index, Object)
addAll(Collection)
addAll(index, Collection)
clear()
contains(Object)
containsAll(Collection)
equals(Object)
get(index) //return object
indexOf(Object)
isEmpty()
size()
iterator()
subList(index, index) //return List
toArray()
set(index, element), replaces the element at the specified position with the passed-in element
retainAll(Collection)
removeAll(Collection)
remove(Object)
remove(index)
lastIndexOf(Object) //return index of the last occurrence of this object


ArrayList:
clone()
toString()
removeRange(fromIndex, toIndex)


Vector:
clone()
toString()
elementAt(index)
setElementAt(Object, index)
firstElement()
lastElement()
removeAllElements()
removeRange(fromIndex, toIndex)


Stack, last-in-first-out (LIFO), extends vector
empty()
peek()
pop()
push(Object)
search(object)


LinkedList //implements Deque Interface
descendingIterator - Returns an iterator over the elements in this deque in reverse sequential order.
removeFirst()
removeLast()
removeFirstOccurrence(Object)
removeLastOccurrence(Object)
peekFirst/Last() //retrieve and don't remove
offerFirst/Last() //insert
pollFirst/Last() //retrieves and removes

Data Binding

There are two types: XML data binding and UI data binding. It's a technique that binds data to the logic or element that accesses and displays it, so any change in the data is reflected in the element bound to that data.

Examples of UI binding: WPF, Cocoa with nibs and variables
UI elements are tied to a variable. Changes to that variable are reflected in the UI element immediately. This can be implemented using event triggers or notifications (the Observer pattern), as sketched below.
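
A hand-rolled sketch of that idea (the BoundValue and Label classes are invented for illustration): when the bound value changes, it notifies its observers and the UI element updates itself:

import java.util.ArrayList;
import java.util.List;

interface Observer {
    void valueChanged(String newValue);
}

class BoundValue {
    private String value;
    private final List<Observer> observers = new ArrayList<Observer>();

    void bind(Observer o) { observers.add(o); }

    void set(String newValue) {
        value = newValue;
        for (Observer o : observers) {
            o.valueChanged(newValue); // push the change to every bound element
        }
    }
}

class Label implements Observer {
    public void valueChanged(String newValue) {
        System.out.println("label now shows: " + newValue);
    }
}

public class DataBindingDemo {
    public static void main(String[] args) {
        BoundValue userName = new BoundValue();
        userName.bind(new Label());
        userName.set("Alice"); // label updates immediately
    }
}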

Example of XML binding: SOAP
An object is used to represent the XML data. One accesses the object to retrieve data about the XML.

Merge Sort

Merge sort is a divide and conquer algorithm that works by dividing a list continuously and then merging/sorting the smaller lists.

Average and worst case running time: O(n log n)
n, because you visit all the elements to merge them at each level; log n, because you're repeatedly splitting the array in 2 (n/2, n/4, ...).

Note: merge sort guarantees O(n log n) even in the worst case, whereas quick sort's worst case is O(n^2); on the other hand, quick sort is often faster in practice because it has lower constant factors and better cache locality.

Most implementations of merge sort require O(n) extra space. That's why some people prefer heap sort, which also runs in O(n log n) time but requires only constant extra space (O(1)).

An interesting note: Java uses a modified version of merge sort for its sort() on objects (e.g., Collections.sort()).

Implementation of merge sort that you can run:

import java.util.List;
import java.util.ArrayList;

class MergeSort {

    List<Integer> list;

    MergeSort() {
        // Initial capacity is just a hint; the list grows past it as needed.
        list = new ArrayList<Integer>(6);
        System.out.println("size before we put anything in it: " + list.size());
        list.add(new Integer(1));
        list.add(new Integer(5));
        list.add(new Integer(3));
        list.add(new Integer(4));
        list.add(new Integer(20));
        list.add(new Integer(0));
        list.add(new Integer(6));
        list.add(new Integer(10));
    }

    public static void main(String args[]) {
        MergeSort ms = new MergeSort();
        ms.sort();
    }

    void sort() {
        if (list != null) {
            List<Integer> result = mergesorthelp(list);

            for (Integer i : result) {
                System.out.println(i);
            }
        }
    }

    // Recursively split the list in half, then merge the sorted halves.
    List<Integer> mergesorthelp(List<Integer> list) {

        if (list.isEmpty() || list.size() == 1) {
            return list;
        }

        List<Integer> a, b, result;

        a = mergesorthelp(list.subList(0, list.size() / 2));
        b = mergesorthelp(list.subList(list.size() / 2, list.size()));

        result = merge(a, b);

        return result;
    }

    // Merge two sorted lists into one sorted list.
    List<Integer> merge(List<Integer> a, List<Integer> b) {
        List<Integer> newList = new ArrayList<Integer>();
        int indexA = 0;
        int indexB = 0;
        int sizeA = a.size();
        int sizeB = b.size();

        while (indexA < sizeA || indexB < sizeB) {
            if (indexA < sizeA && indexB < sizeB) {
                // Both halves still have elements: take the smaller head.
                if (a.get(indexA) > b.get(indexB)) {
                    newList.add(b.get(indexB));
                    indexB++;
                } else {
                    newList.add(a.get(indexA));
                    indexA++;
                }
            } else if (indexA < sizeA) {
                newList.add(a.get(indexA));
                indexA++;
            } else if (indexB < sizeB) {
                newList.add(b.get(indexB));
                indexB++;
            }
        } // end while

        return newList;
    }
}

Wednesday, October 19, 2011

Simple Java program to test how much time an action takes

public class TimeTest {
    public static void main(String[] args) {
        int count = 0;
        long startTime;
        startTime = System.currentTimeMillis();
        while (count < Integer.MAX_VALUE) {
            count++;
            if (count % 1000 == 0) {
                System.out.println(count);
            }
        }
        System.out.println("Time in ms: " + (System.currentTimeMillis() - startTime));
    }
}

Threads vs Processes

First what are threads/processes?
A chunk of code that can be executed in parallel while a program is doing something else. It is scheduled by the operating system.

Why are they important?
Because why completely block everything on one program that's waiting on user input, when you can run another program/process in the background that needs to do something else, like read in a file.

Threads:
Within a process share the same address space
Monitored through debuggers
Belongs to one process


Processes:
Can have many threads in them or be singly threaded, an execution instance of a program
Have separate memory address spaces
Can be monitored by programs like 'top'
Communicate to each other via inter-process communication
You launch a process that launches the thread(s)
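
A minimal Java example of the thread side (the names are arbitrary): two threads started within one process, each running its own chunk of code while main waits for them:

public class ThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        Runnable work = new Runnable() {
            public void run() {
                System.out.println(Thread.currentThread().getName() + " doing work");
            }
        };

        Thread t1 = new Thread(work, "worker-1");
        Thread t2 = new Thread(work, "worker-2");
        t1.start();
        t2.start();

        t1.join(); // wait for both threads before the process exits
        t2.join();
        System.out.println("main done");
    }
}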

Tuesday, October 18, 2011

Red Black Tree

Self balancing binary search tree.

Searches, Inserts, and Deletes in O(log n) where n is the number of items in the tree.
Its leaf nodes don't contain data (are null).


In Java, TreeMap is implemented as a red-black tree. TreeMap guarantees that the keys of the map are kept in ascending sorted order, according to their natural ordering or to the comparator provided in the constructor when the map is created.
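
A small example of that ordering guarantee in Java:

import java.util.TreeMap;

public class TreeMapDemo {
    public static void main(String[] args) {
        // Backed by a red-black tree; keys come back in sorted order.
        TreeMap<String, Double> gpas = new TreeMap<String, Double>();
        gpas.put("Kelly", 3.0);
        gpas.put("Anna", 4.0);
        gpas.put("David", 4.0);

        System.out.println(gpas.keySet());   // [Anna, David, Kelly]
        System.out.println(gpas.firstKey()); // Anna
    }
}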

Hashtable vs HashMap

Hashtable: synchronized. HashMap: unsynchronized.
Hashtable: does not allow null keys or values. HashMap: allows one null key and any number of null values.
Hashtable: extends Dictionary. HashMap: extends AbstractMap.
Hashtable: default capacity is 11. HashMap: default capacity is 16.
Both implement the Map interface.
Both have a default load factor of 0.75, after which they resize.


I also read that HashMap does not guarantee the order of the map to remain constant over time.

Synchronized in Java

If a method or block is synchronized, only one thread at a time can execute it on a given object (it must acquire the object's monitor), which is how you make access to shared state thread safe.

You can make a non-synchronized collection synchronized by adding the following line:

Map m = Collections.synchronizedMap(new HashMap(...));

You can make a block of code/access to an object synchronized by doing this:

synchronized(m) {
    //do something
}

If you choose option 1, make sure you use the reference returned by the synchronizedMap function.
If you choose option 2, make sure you synchronize every access to that object/collection.

Synchronized vs Unsynchronized

The first implementations of some Java classes were thread safe, synchronized against threads accessing their data at the same time. However, Java then realized many programs are single threaded and this is overkill, since synchronization adds overhead, so they introduced unsynchronized (and thus faster) versions of those classes.


Synchronized -> Unsynchronized
StringBuffer -> StringBuilder
Vector -> ArrayList
Hashtable -> HashMap


Note on synchronization: only concurrent access to the same collection is synchronized; you may still need additional synchronization outside of that. Two threads can't add and remove from the same collection at the same instant, but you might still want coarser synchronization around a whole sequence of operations, see the example:


Thread 1:

if (!list.isEmpty()) {
    //print list
}

Thread 2:

for (i = 0; i < size; i++) {
    list.remove();
}

Thread 1 obtains the lock on the list and checks that it isn't empty. It gets suspended before it prints the list. Thread 2 then removes a bunch of items. Thread 1 continues and now prints out an empty list (not what it wanted to do).

dead beef

Q: What is dead beef?

A 32-bit hex magic value that programmers use as a marker (for example written into memory, or shown as an address) to help them debug.
Example:
0xDEADBEEF

Each hex digit takes up 4 bits, so the 8 digits make up 32 bits.

Depending on the system, it can mean different things. One system may be telling you there is a deadlock issue when you see this value. Another might put it out if you have memory corruption, access a word at the wrong boundary, etc.


Lazy Initialization

Concept of delaying the creation of an object or computation of something until the first time it is needed.
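
A minimal sketch in Java (the ExpensiveResource class is made up for illustration): the object is only constructed the first time the getter is called, and reused afterwards:

public class LazyInitDemo {
    private ExpensiveResource resource; // starts out null

    ExpensiveResource getResource() {
        if (resource == null) {
            resource = new ExpensiveResource(); // created on first use only
        }
        return resource;
    }

    static class ExpensiveResource {
        ExpensiveResource() {
            System.out.println("expensive construction happening now");
        }
    }

    public static void main(String[] args) {
        LazyInitDemo demo = new LazyInitDemo();
        System.out.println("before first use");
        demo.getResource(); // constructor runs here
        demo.getResource(); // reused, not constructed again
    }
}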

Adsense vs Adwords

AdSense
A program (free) that allows online publishers (websites, stores, blogs, etc) to make money by placing relevant ads on their site. Websites earn the money through people clicking on the ads. You can also easily add google search to your site through this program, and earn money through the ads displayed on the search results page. Basically, Google is paying websites to allow them to put ads on their site.

AdWords
Google's program that allows businesses to buy ads to be displayed on websites (either Google's own or sites in Google's ad network). You only pay if someone clicks on your ad.

Monday, October 17, 2011

Bubble Sort

With this algorithm, you go through the array checking whether each value is bigger than the one after it, and you propagate/"bubble" the bigger item up the array (towards the end) by swapping it with the item next to it. Going through the array/list once constitutes one pass. You make as many passes as needed until the array is sorted (no swapping has taken place on the last pass).
Best Case: O(n), everything already sorted. You take one pass to figure that out.
Worst case running time: O(n^2).
In place: Yes.

Code:

swapped = false;

do {
    swapped = false;
    // stop at the next-to-last element since you're comparing with the next item
    for (i = 0; i < size - 1; i++) {
        if (arr[i] > arr[i+1]) {
            swap(arr[i], arr[i+1]);
            swapped = true;
        }
    }
} while (swapped)


What you will realize is that with the first pass, you're placing the biggest element at the end of the array (fixed in its correct place); 2nd pass: 2nd biggest element, etc. So you don't really need to iterate through the whole array every time, only up to arr.size minus the pass you are on (since those items are already fixed in place).

swapped = false;
int index = arr.size;

do {
    swapped = false;
    // stop at the next-to-last unsorted element since you're comparing with the next item
    for (i = 0; i < index - 1; i++) {
        if (arr[i] > arr[i+1]) {
            swap(arr[i], arr[i+1]);
            swapped = true;
        }
    }
    index--;   // the largest remaining element is now fixed at position index - 1
} while (swapped)
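
For completeness, here is a runnable Java version of the pseudocode above (the array contents are arbitrary):

import java.util.Arrays;

public class BubbleSort {
    public static void main(String[] args) {
        int[] arr = {5, 1, 4, 2, 8};
        boolean swapped;
        int index = arr.length;

        do {
            swapped = false;
            for (int i = 0; i < index - 1; i++) {
                if (arr[i] > arr[i + 1]) {
                    // swap the pair so the bigger value bubbles toward the end
                    int tmp = arr[i];
                    arr[i] = arr[i + 1];
                    arr[i + 1] = tmp;
                    swapped = true;
                }
            }
            index--; // the largest remaining element is now in its final spot
        } while (swapped);

        System.out.println(Arrays.toString(arr)); // [1, 2, 4, 5, 8]
    }
}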



You will realize that for every element that is not in its place (off by one position), you will need one full pass to fix it, plus a final pass to go through the array and make sure everything is good.
1 swap to get arr ordered = 2 passes
2 swaps to get arr ordered = 3 passes
n swaps to get arr ordered = n+1 passes

To keep it easy, we will define 1 pass to mean you've visited all n elements. In reality it's more like n-i, with i being the pass count, because as you fix items in place you don't need to traverse that far into the array. So if you know an array is almost sorted, say 4 elements are out of place/4 swaps are needed for the array to be sorted, then the cost is about 5n, i.e., still O(n). That might be better than some other algorithms.

So in the worst case, an element is very far off (at the other end of the array): you will need n+1 passes to fix it, and each pass examines about n items. Thus, O(n^2).

Wednesday, October 12, 2011

Nil, nil, null, NULL

Ahh the differences between the various nulls that all kind of typedef to 0. Mainly we have them as a convention to denote certain things.

Java
null - a reserved constant/literal that denotes a reference pointing to no object.
Internally it is represented as zero, but you do not compare an int to it.

Javascript
undefined - the variable hasn't even been assigned a value yet

null - it is assigned and its value is null (nothing)

If you compare the two with ==, you'll find they are equal to each other (null == undefined is true), but they are separate concepts, and if you want to find out whether a variable is undefined, don't just compare it to null.

C
NULL - a null pointer constant, often defined as (void *)0; to be used with pointers.
Don't use it with ints; you might get an error or a warning.

C++
NULL - 0
Can be used with pointers and ints. It signifies a pointer that is not pointing to anything.

Objective-C
nil - to be used with object pointers (id) to indicate 'no value'.

Nil - to be used with class pointers (Class) to indicate 'no class'/'no value'.

NSNull - an object that represents nil/null. Where you cannot pass or store nil (for example in collections), use this non-nil singleton object to represent a null value.

Tuesday, October 11, 2011

You are at a party with a friend and 10 people are present including you and the friend. Your friend makes you a wager that for every person you find that has the same birthday as you, you get $1; for every person he finds that does not have the same birthday as you, he gets $2. Would you accept the wager?

There is a 1/365 chance that you share a birthday with a random person, and a 364/365 chance you don't. With the 8 other guests, you'd expect to win about 8 x (1/365) x $1 ≈ $0.02, while your friend would expect to win about 8 x (364/365) x $2 ≈ $15.96. Don't take the wager.
What is the angle of a clock's hands when the clock shows 3:15?

360 degrees in a full circle. 90 degrees per quarter (that breaks up every 3 hours segment).

So that's 30 degrees per hour. Each hour can be broken up into 4 parts (one for every 15 minutes), 30/4 = 7.5 degrees.

The minute hand would point to 3, and the hour hand would be 1/4 of the way from 3 to 4. It is a quarter of the way to 4 because only 15 minutes (which is 1/4 of 60 minutes) have passed. So the angle between the hands is 1/4 of 30 degrees = 7.5 degrees.
Say an advertiser makes $0.10 every time someone clicks on their ad. Only 20% of people who visit the site click on their ad. How many people need to visit the site for the advertiser to make $20?

Number of clicks needed to make $20:
10 clicks to make $1
x 20
= 200 clicks to make $20

Those 200 are the 20% that clicked.

(20/100) * x = 200

20 * x = 20000
x = 1000

So 20% of 1,000 people is 200. So 1,000 people need to visit the site.

Saturday, October 8, 2011

Why?

Why this blog? Because every time I want to interview, I scramble to a heap of coffee-stained papers from X years ago, underneath a dusty desk, that have some questions on them to practice with. Instead, I've decided to streamline this process. I'll post everything online so it will be organized and I can even search it quickly! Also, I won't have to deal with dust bunnies O_o

The blog will contain questions I've come across and will aim to explain concepts for others' benefit as well. Happy Coding!