# Question about GPP compiler

You will use the GPP compiler to test and answer the following questions. (**Please run the tests yourself and record the relevant data to write a report; do not use any data or answers from the Internet.**)

1. Assume the data cache is fully associative with 60 lines, each line holds 10 doubles, the replacement policy is least recently used (LRU), matrix elements are doubles, and n=4000. What is the total number of read cache misses for each matrix in each of the following three matrix multiplication algorithms? Implement these algorithms and verify the correctness of your implementation by checking the maximum error in the matrix C. Compile your code using gcc without any optimization flag and run your executables on any computer you can access. Rank the execution times and explain why they differ.

    /* ijk version */
    for (i=0; i<n; i++)
        for (j=0; j<n; j++)
            for (k=0; k<n; k++)
                c[i*n+j]=c[i*n+j]+a[i*n+k]*b[k*n+j];

    /* ikj version */
    for (i=0; i<n; i++)
        for (k=0; k<n; k++)
            for (j=0; j<n; j++)
                c[i*n+j]=c[i*n+j]+a[i*n+k]*b[k*n+j];

    /* kji version */
    for (k=0; k<n; k++)
        for (j=0; j<n; j++)
            for (i=0; i<n; i++)
                c[i*n+j]=c[i*n+j]+a[i*n+k]*b[k*n+j];

2. Compile the attached simple matrix multiplication code dgemm-simple.c using gcc without any optimization flag and report the execution time of the program on any computer you can access. Compile the attached optimized matrix multiplication code dgemm-optimized.c using gcc with the optimization flag "-O3" and report the execution time of the program on the same computer you used for dgemm-simple. Calculate how much faster dgemm-optimized runs than dgemm-simple. As we explained in class, architecture knowledge is very important when studying or designing compilers and operating systems. Can you read the program dgemm-optimized.c and briefly explain how architecture knowledge may help programmers write computer programs that run faster?