-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathtodo.txt
352 lines (253 loc) · 10.1 KB
/
todo.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
Bugs:
- comparisons.in -> compilatin takes a lot of time, prob register allocation again
- umm what?
# Instruction: b = %temp2 # local
movl -8(%ebp), %ebx
movl %ebx, -24(%ebp) # Store spilled
movl -24(%ebp), %ebx # Load spilled
movl %ebx, %ebx
movl %ebx, -28(%ebp) # Store spilled
movl -28(%ebp), %ebx # Load spilled
andl $0x1, %ebx
movl %ebx, -28(%ebp) # Store spilled
movl -28(%ebp), %ebx # Load spilled
cmp $0x1, %ebx
jne error_assert_failure
movl -24(%ebp), %ebx # Load spilled
subl $0x1, %ebx
movl %ebx, -24(%ebp) # Store spilled
movl -20(%ebp), %ecx # Load spilled
movl -24(%ebp), %ebx # Load spilled
movl %ecx, 24(%ebx)
With this source:
let a = 0;
let b = a + 1;
let c = a + 2;
print(b);
print(c);
Reduce type confusion:
- Add more error handling (runtime + generated code)
-- Errors for < > <= >=
-- Wrong parameter count
-- Calling something which is not a function
- Add metadata for arrays (array length), change array indexing code.
- Add strings, string handling runtime functions, and metadata for strings.
- Add taggedness checks to generated code.
- Add type metadata for internal classes like FunctionContext / Function and check it before indexing the class.
- Represent FunctionContext and Function as classes in the code generation (so that the indexing code + type checks are generated automatically)
- Add user-defined objects and type checks for them.
- When I add objects, I also need to annotate parameters / variables, so that I know what x.prop means.
-- Then I need to also disallow assigning an object to a different type into a variable.
- Then the compiler could whine at build time if I pass an object of the wrong type..
Efficiency:
- No redundant tag checks, keep track of what has been tag checked already. For this we probably need to introduce the notion of "type of a variable".
- No redundant loads if the value is already in a (virtual) register.
- Also no need to store the variable back if the value hasn't changed.
- But it's hard... e.g. a basic block could call a function that changes the value of an outer function variable, so how can we even do this?
Priorities:
1. functions improvements:
-- as first-class
2. strings
3. objects (need to figure out the type system!)
4. separate compilation
5. optimizations
6. whatever I might need for implementing the programming language in itself :)
-- data structures (maybe implemented by the runtime? or call into c++?)
-- I/O
7. misc
-- add timing printouts to tests
-- syntax error positions
-- need better object descriptors (maybe not everything that looks like a pointer is a pointer, if we're inside an object) (at least needed for implementing bigger numbers)
8. Errors
-- Throw a runtime error if we try to do "pointer + int" or sth.
-- If we try to index into an integer
-- Array with negative length (runtime func needs to be able to throw an error)
-- passing a ptr where we should pass an int (e.g., length of array)
Random bugs / corner cases:
- Missing return statement causes a segfault (e.g., case where the func was meant to return an inner func)
- Check all usages of PARegisterAndOffset; shouldn't we multiply by POINTER_SIZE everywhere?
- Not sure if the "locals and params count" is correct, maybe that's multiplied by POINTER_SIZE..
- Add tests for what happens when trying to access the return value for print.
- Too many syntax errors: grammar.SyntaxError: SyntaxError: Syntax error: Expected identifier_or_array_continuation, got token_identifier. (33) (for "asdf asdf")
- This test:
function outer(a) {
function inner() {
a = a + 1;
return a;
}
return inner;
}
let f = outer(4);
f(); // must return 5
f(); // must return 6
This tests that the function context is kept alive.
Arrays:
- Parsing more complicated cases doesn't work (probably)
let bar = foo()[3];
- Also longer: foo()() or foo()[0]() etc.
Small stuff:
- test for exact error message for non-runtime errors too
-- make the program exit value be non-zero in case of runtime error
- function comparison doesn't work
- print out the statement as comments in the program?
- mark some functions staticmethods if we don't need an object...
- Allow negative numbers but not always... so e.g., 1 + (-1) is OK but 1 + -1 is not.
- logical operators!! note that they need to shortcut both in the interpreter and in the compiler.
- python interpreter.py from stdin? or interactively?
- should using an undefined return value be a runtime error? i guess yes, it's not like it's ever intended...
- comparing functions with ==, does it work??
Medium stuff:
- register allocation is still damn slow
- the whole undefined thing... we need a representation for undefined, we need to init variables as undefined, and check for undefined in suitable places.
- break and continue
- should we allow: outer()(); when outer returns a function? Now the lang forces assigning it to a variable... in general, we should allow expression();
- clearer distinction between the phases. when we're done with cfg creation, we should just discard the code, for example. but we must keep the results of the scope analysis around...
- reorganize statements such as... a + b * c -> b * c + a, a + 5 + b -> a + b + 5, and so on.
Big stuff:
- Proper type system, distinguish between different functions!
-- add types like strings and such
-- when assigning functions it needs to be so that the assigned variable gets the proper type too, this way we can do the "param number" check at scope resolution time.. iguess??
- Objects (so that we can also return several values)
- Runtime
- Compatibility with C libraries
Random ideas:
- Make it possible to have several return values. Because why not!
------------------------
Plan for the middle + backends:
- convert into a representation with virtual registers (maybe cfg + virtual registers) for arithmetic operations + function calls (no returning functions or any of that complex stuff)
- interpreter for the intermediate language and testing framework for it
- register allocator
- real code generation for arithmetic operations
- proof of concept for the runtime
- objects in the parser
- "give me a new object" runtime function which first just consumes all the memory and doesn't collect garbage
- some part must keep track of what the local variables of a function are.
Later:
- garbage collection
- add returning functions, mucking with contexts and all that complex stuff
Really later:
- code improvements
--------------------------
Expression hoisting:
Note that in this case, the expression a = b cannot be moved out of the loop:
let a = 0;
let b = 0;
let c = 0;
if (c == 0) {
b = 5;
a = b;
b = 4;
} else {
a = b;
}
------------------------
# TODO: phi functions - when to create? Or do we need them here, can't we just
# use the actual variables names still? And virtual registers just for temporary
# values?
# TODO: augment this for more complex instructions, such as creating a function
# context, returning stuff, calling into runtime whenever needed, etc.
# let d = a + b + c;
# temp = a + b;
# d = temp + c;
# or maybe even:
# temp = a + b;
# temp2 = temp + c;
# d = temp2;
# But if we go full ssa, we still need to know which variables are local and
# which are global and so on, how does this all work? Variable contains that
# information.
# We probably need a better notation of a language that we can use here.
let a = 0;
function foo() {
let b = 0;
a = 6;
if (0 == 0) {
let c = 4;
c = 4;
} else {
a = 7;
}
b = 8;
}
function bar() {
while (0 == 0) {
let c = 4;
}
}
-------------------
- variable can be in local scope, top scope, or outer function context. is the outer function scope taken into account at all?
- Gather information about:
- what are the top level variables
- what are the local variables for each function
- which variables are referred to from the inner functions (need to be in function context)
-> all of these should be in the scope analyser!!!
- what's the shape of the function context
function() {
let a = 0;
if (0 == 0) {
let a = 0;
}
}
function has 2 variables: a und a. All references are resolved, so we know what refers to what.
a = 1;
a = 1 + 2;
a = b + c;
a = foo();
a = a;
a = b;
a = foo();
call foo
b = rv
a = b;
load b
what if b is a local
rb = load(fc + displacement)
store(rb, fc + displacement)
a = b + 2;
rb = load(fc + displacement)
rb = rb + 2
in the end, we store from a temporary to a local/global/param/outerlocal/outerparam.
- the runtime (or something) must know how to create function context objects. how will it know? the executable will need to store some shape information for functions.
function outer() {
let a = 5;
function inner() {
function innerinner() { let b = inner2; return b; }
return innerinner();
}
return inner;
function inner2() { write(a); }
}
let c = outer();
let d = c();
let e = d();
e();
---------------------------
inner function inside a block scope:
if (a == 0) {
let i = 8;
function inner() { return i; }
}
------
as -32 -ggstabs -o temp.o temp.as
gcc -m32 temp.o runtime/runtime.o runtime/builtins.o
0xffffcea0: 0x00000000 0x00000000 0xf7fa6cc0 0x00000000
0xffffceb0: 0x0804b02a 0xf1d3c600 0x00000030 0x00000000
0xffffcec0: 0xffffcee4 0x080485c0 0xffffcee4 0x080485d7
^^ callee push
0xffffced0: 0x0804b02a 0x00000000 0x0804b02a 0x08048735
^^ caller push ^^ caller push ^^ caller push ^^ spill
0xffffcee0: 0x0804b009 0xc0decafe 0xffffcf18 0x0804873d
^^ spill ^^ marker
0xffffcef0: 0x00000001 0x00000007 0xf7e22a50 0xffffcfc4
0xffffcf00: 0x00000001 0xffffcfc4 0x00000000 0xf1d3c600
^^ stack in main
0xffffcf10: 0xf7fa63dc 0xffffcf30 0x00000000 0xf7e0c637
0xffffcf20: 0xf7fa6000 0xf7fa6000 0x00000000 0xf7e0c637
0xffffcf30: 0x00000001 0xffffcfc4 0xffffcfcc 0x00000000
Runtime starting. Stack is at 0xffffcf08.
(gdb) disas
Dump of assembler code for function builtin_Array:
0x0804879e <+0>: push %ebp
0x0804879f <+1>: mov %esp,%ebp
0x080487a1 <+3>: sub $0x28,%esp
=> 0x080487a4 <+6>: mov 0x8(%ebp),%eax