some inteview questions

Q1: Assume your assembly language includes ONLY the following instructions:

'inc REG': increments a given register by one. 'dec REG': decrement a given register by one. 'jnz LABEL': jumps to a given LABEL if the previous instruction's result was not zero. 'HALT': stops running.

Task: A and B registers hold non-negative values. The program should calculate the value of |A-B| and locate the result in C. In addition, the language holds registers C,D,...,Z, which you can assume are initialized at program start to zero.

answer:  
// save the A B first

SaveA:
  int D
  int E
  dec A
  jnz SaveA
Save_B:
  inc F
  int G
  dec B
  jnz Save_B

D: // set A
  inc C
  dec D
  jnz D
DECB:
  dec C
  jnz T2

B:
  inc C
  dec G
  jnz G
DECA:
  dec D
  jnz DECA
  halt   
T2:
  dec G
  jnz DECB

Q2 :The performance of a program is slower than the same program before the backend of the compiler has changed. What direction would you look into this issue and solve it?

Maybe the IR is a good start point for llvm

Q3: Optimize this code (like what compiler would reconstruct an equivalent source code as):

int fn (int a, int b) { 
  int sum = 0; 
for(int i=4*a;i>0;i--) { 
  sum+=b*i*i; 
} 
return sum; 
}

I'm not sure the correct answer

what i can do

1 i=4*a; ==>  i= a>>2;
3 loop expand 

4 sse optimize

Q4 Given a sequence of instructions, form a control flow graph (CFG) for it. Then write the instructions in SSA form. Then perform a specified code transformation on the CFG. Often this will be PRE - partial redundancy elimination.

Q5 Given two sequences of instructions that are functionally the same but have reordered operations, describe why you’d prefer one over the other and under which conditions. I usually ask this in the context of GPUs with Apple Metal or CUDA.

Q6 Given a (bad) sequence of code, describe what it does. Tell me what’s wrong with it, and how a static analysis could (or could not) detect the error at compile time. Build the static analysis.

Q7 For compiler engineers that know Python: Here is some Python code that appends to a string. It takes a really long time to run. Why? For backend compiler work: what are the limitations of X architecture (GPU, VLIW, out-of-order superscalar, FPGA, something else)? What is it good at running? What is it not good at running? Which code transformations, manual or automatic tend to improve performance?

Q8 Some code doesn’t run fast enough, so the developer added explicit prefetching. Now the code runs even slower. Why? For positions that may require adapting or creating a programming language: tell me what feature Y does for language Z, what it enables, and how does it complicate your job? Can you think of alternatives?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

some inteview questions

Clone this wiki locally