Before commenting on "Parallel or Distributed Computing", I was curious to know why we really need it. I remembered the famous observation attributed to Moore:
"CPUs get twice as fast every 18 months...."
I believe this law still somehow holds true, but then the big question is: why do we need anything called parallel or distributed computing when I know I will have CPUs twice as fast in the next 18 months?
[1] The data on the web is more than doubling every few months. In the last few years we have been adding more and more devices that put more and more data on the web: the iPhone, iPad, and Kindle are all new sources of data generation and access on the web.
[2] Moore's law might not keep holding true as time goes on. We are moving to a world where everyone prefers devices that are smaller and smaller in size and weight. We are already reaching a point where chip features approach their minimum size, and squeezing more instructions per second out of a single chip will not be that easy.
[3] Even with a supercomputer, it might not be easy or fast to render the 50 thousand animated soldiers in the battle scenes of "The Mummy" movie, where each animated soldier is composed of millions of frames and each frame has its own effects and attributes.
There are a hundred more reasons why computer scientists started looking at parallel and distributed systems around 20-30 years back, and today we have reached a stage where no software company across the world can breathe without distributed computing.
First, I have to understand what exactly distributed computing is.
Let me start with a very small example. Almost every software programmer starts his programming life with a for loop, so let's see how this one works in different scenarios:
int a[100], b[100], c[100];   /* assume a and b are already filled */
for (int i = 0; i < 100; i++)
{
    c[i] = a[i] + b[i];
}
A machine with a single CPU core executing this program:
For each iteration of i, the values of a[i] and b[i] are loaded into memory and then the addition happens. Suppose the time taken to load the data and perform one addition is T; then the whole execution takes roughly 100 * T.
Parallel Programming:
Suppose you have an N-core CPU and your compiler is clever enough to work in the parallel programming paradigm. It will keep loading data for N addition operations at a time and executing the additions in parallel, so the same for loop now takes roughly 100 * T / N.
In layman's terms, in parallel programming CPU instructions run in parallel and share the same in-memory data. There is no role for the network or data on the wire here.
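As a minimal sketch of what this can look like in practice (assuming an OpenMP-capable compiler, which is just one way to do it), the same loop can be split across the N cores with a single pragma:

int a[100], b[100], c[100];   /* assume a and b are already filled */

void add_parallel(void)
{
    /* OpenMP divides the 100 iterations among the available cores;
       all threads read and write the same in-memory arrays */
    #pragma omp parallel for
    for (int i = 0; i < 100; i++)
    {
        c[i] = a[i] + b[i];
    }
}

Compiled with something like gcc -fopenmp, each of the N threads handles roughly 100 / N iterations, which is exactly where the 100 * T / N estimate comes from.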
Now we move to distributed programming, where we have M machines, each of which has N CPU cores, and the machines are connected by a network. We again have the same task to run.
In this case each machine loads into its own memory the data for its share of the addition operations and then performs the additions. This reduces the total time taken for our task to roughly (100 * T / (N * M)) + X.
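As a rough sketch (assuming an MPI-style setup, which the post does not name and is only one way to do this), each machine can own its own contiguous slice of the index range:

#include <mpi.h>

int main(int argc, char **argv)
{
    int a[100], b[100], c[100];   /* assume a and b are already filled */
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this machine's id: 0 .. M-1 */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* size == M machines */

    /* each machine computes only its own slice of the 100 additions
       (assuming 100 divides evenly by M, for simplicity) */
    int chunk = 100 / size;
    for (int i = rank * chunk; i < (rank + 1) * chunk; i++)
    {
        c[i] = a[i] + b[i];
    }

    MPI_Finalize();
    return 0;
}

MPI here is purely illustrative; any mechanism that partitions the index range across machines gives the same shape of speedup.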
What is this X? I will try to cover it in the next post of this blog.
"CPUs are double faster in every 18 months ...."
I believe this law still some how hold the truth but than big question is "why we need any thing called parallel or Distributed Computing when I know I have 2 times faster CPUs in next 18 months ? "
[1] The data in the web is more than doubling in few months only ....In last few years we are adding more and more devices which put more and more data on the web ...I Phone, I Pad , Kindle devices all of these are new source of data generation and access on the web.
[2] Moore's law might not be keep holding truth with time go : As we are moving to world where every one prefer devices with small and small in size and weight ... We already reaching to a point where chip size will reach to it's minimum size and getting more instructions execution per second out of it will not be that easy
[3] Even with a super Computer it might not be easy and fast to create 50 thousands animated soldiers in the war of "The Mummy " movie. Where each animated solider composed of millions of frames and each frame have it's own effect and attributes.
There are hundred more reasons which were the reason why Computer Scientist started looking on "Parallel and Distributed Systems around 20-30 years back" and today we reached to a stage where no Software Company across the world can breath with out Distributed Computing.
I have to first understand what exactly Distributed Computing is ?
I start with a very small example. Any software programmer start his programming life with a For loop code so let's see how it work in different scenarios
for(int i=0; i< 100 ; i++ )
{
c[i] = a[i] + b[i];
}
A machine which have a single CPU core to execute this program :
For each iteration of i , value of a[i] and b[i] will be loaded in memory and than addition operation will happen . Suppose time time taken in loading this data in memory is T than some how around it will take 100 * T time to complete this execution ...
Parallel Programming:
Suppose you have a N Core CPU and your compiler is brilliant enough to work in Parallel programming paradigm so In Parallel it will keep uploading data for N addition operation and executing the plus operation and some how your the same for loop execution it will take time around ( 100 * T / N )
In laymn's term we can say in Parallel programming CPU instructions run in parallel which share same In Memory data. There is no role of network or data on wire in it.
Now we move to Distributed Programming where we have M number of machines , each of them have them have N CPU Core . These machines are connected by a Network . We again have same task to run in
In this case each machine will be busy in loading data in memory for N addition operation and than doing the plus operation. It will reduce the total time taken for our task to ( 100 * T / N * M ) + X ...
What is this X ? I will try to cover in next post of this blog.