This PR resolves #156
Implemented a new `server_command` function to serve `.scrap` files over
HTTP, allowing retrieval by path or hash. Added error handling for
missing files. Included unit tests for server functionality in
`scrapscript_tests.py` to ensure proper operation and error responses.
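As a rough sketch only (the real `server_command` and its routing are not shown here, and `ScrapHandler` is a hypothetical name), a handler along these lines could serve `.scrap` files over HTTP and return a 404 for missing ones:

```python
import http.server
import os

class ScrapHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler: GET /<name> serves <root>/<name>.scrap."""
    root = "."  # directory containing .scrap files

    def do_GET(self):
        # Treat the URL path as a relative file name (or content hash).
        name = self.path.lstrip("/")
        path = os.path.join(self.root, name + ".scrap")
        if not os.path.isfile(path):
            self.send_error(404, "scrap not found")
            return
        with open(path, "rb") as f:
            body = f.read()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep request logging quiet
```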
---------
Co-authored-by: Max Bernstein <tekknolagi@gmail.com>
Co-authored-by: Max Bernstein <max@bernsteinbear.com>
I added a function called `Token.with_source` to simplify the
repeated logic in the source extent tests, but if it's unnecessary I can
easily revert those commits.
Edit: Linking the PR for easy, future reference: #241.
I have implemented the functionality necessary to preserve source
information through the parsing stage.
---------
Co-authored-by: Max Bernstein <max@bernsteinbear.com>
It now recommends that users use `uv` and `scrapscript_tests.py` for
testing.
This pull request adds a `class Peekable` which augments a wrapped
`Iterator` with a `peek` method. The parser now takes as input an
instance of `class Peekable` that produces a series of tokens rather
than relying on a materialised `list` of tokens. For now,
`tokenize` is altered to wrap the returned `list` of tokens in an
instance of `class Peekable`, but, with another PR, we should be able to
complete issue #84 and make the lexer fully pull-driven by turning it
into a proper `Iterator`, which we would wrap within an instance of
`class Peekable`.
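A minimal sketch of the idea (the actual `Peekable` in the PR may differ in details):

```python
from typing import Generic, Iterator, TypeVar

T = TypeVar("T")

class Peekable(Generic[T]):
    """Wraps an iterator, adding single-item lookahead via peek()."""

    def __init__(self, it: Iterator[T]) -> None:
        self.it = it
        self._buf: list = []  # holds at most one buffered item

    def peek(self) -> T:
        # Pull one item ahead without consuming it; raises StopIteration
        # if the underlying iterator is exhausted.
        if not self._buf:
            self._buf.append(next(self.it))
        return self._buf[0]

    def __next__(self) -> T:
        if self._buf:
            return self._buf.pop(0)
        return next(self.it)

    def __iter__(self) -> "Peekable[T]":
        return self
```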
---------
Co-authored-by: Max Bernstein <max@bernsteinbear.com>
The first batch of commits in this pull request involves minor refactors
that seemed worthwhile in the course of development.
The latter half of this PR adds code in service of eventually closing
#86. The main additions are the classes `SourceLocation` and
`SourceExtent`. `class SourceLocation` is a dataclass that stores a line
number, column number, and byte index into the scrapscript program
source.
`class SourceExtent` has `start` and `end` members of type
`SourceLocation` to represent a swathe of contiguous source code from
the user's scrapscript program. The line and column numbers are
1-indexed as before whereas the byte numbers are 0-indexed. All end
indices are inclusive of the final characters of their corresponding
tokens, i.e. they do *not* demarcate the positions of "one past the end"
of the tokens.
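A sketch of the two dataclasses as described above (field names here are illustrative and may not match the PR exactly):

```python
from dataclasses import dataclass

@dataclass
class SourceLocation:
    line: int  # 1-indexed
    col: int   # 1-indexed
    byte: int  # 0-indexed byte offset into the program source

@dataclass
class SourceExtent:
    start: SourceLocation
    end: SourceLocation  # inclusive: points at the token's final character
```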
The classes `Token` and `Lexer` are modified to work in terms of
`SourceExtent` so that the lexing phase can maintain richer source
information. This new code structure should be more robust and
allow for easier refactoring in the future if needed.
I expect to do a follow-up PR that can build atop this one to maintain
richer source information tracking for the parsing stage.
Make type inference match run-time behavior. Add tests.
Don't bother creating a Binop; we can catch this case earlier. It makes
some things in the compiler (and their testing) easier.
It's unused and we can get rid of the global fn_counter.
`parse` now calls into `parse_binary` and `parse_unary` and is a bit easier to read.
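A toy precedence-climbing sketch of that shape (the names mirror the description, but the actual scrapscript parser is more involved):

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2}

def parse_unary(tokens):
    # Consume one token; handle a hypothetical prefix minus.
    tok = tokens.pop(0)
    if tok == "-":
        return ("neg", parse_unary(tokens))
    return tok

def parse_binary(tokens, min_prec=0):
    # Precedence climbing: fold in operators at or above min_prec.
    left = parse_unary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        right = parse_binary(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left

def parse(tokens):
    return parse_binary(tokens)
```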
I started reading through the parsing code and noticed that this token
was unused. Let me know if you'd like me to make any further
changes!
---------
Co-authored-by: Abel Sen <abelsen@Abels-MacBook-Air.local>
We don't need it; we should just only return *unbound* type variables in
`ftv_ty` and only apply substitutions to unbound type variables in
`apply_ty`.
(I think.)
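An illustrative sketch of "only return unbound type variables" (the `TyVar`/`TyCon`/`forwarded` names are hypothetical, not the repo's actual classes):

```python
from dataclasses import dataclass

@dataclass
class TyVar:
    name: str
    forwarded: object = None  # set to another type once the variable is bound

@dataclass
class TyCon:
    name: str
    args: list

def ftv_ty(ty):
    """Collect the free (i.e. unbound) type variables of `ty`."""
    if isinstance(ty, TyVar):
        if ty.forwarded is not None:
            return ftv_ty(ty.forwarded)  # bound: look through the binding
        return {ty.name}
    result = set()
    for arg in ty.args:
        result |= ftv_ty(arg)
    return result
```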
- **WIP: test, but working row poly**
- **Add comments**
- **Add tests**
Still TODO:
* Record typing
* Row polymorphism
* Variant typing
Co-authored-by: @rdck
Also remove Compose because it doesn't really represent anything other
than a new function.
- Also verify that the shadow stack points into the heap
- Scan from base of heap, not to/from space
- Use more expressive error
- When growing the heap, verify before swapping in new space
Goes from an assembly mess with SIMD and stuff to:
```
0000000000002070 <small_string_concat>:
return (((uword)obj) >> kImmediateTagBits) & kMaxSmallStringLength;
2070: 89 f8 mov eax,edi
2072: c1 e8 05 shr eax,0x5
2075: 83 e0 07 and eax,0x7
uword length = small_string_length(a_obj) + small_string_length(b_obj);
uword result = ((uword)b_obj) & ~(uword)0xFFULL;
2078: 48 89 f2 mov rdx,rsi
207b: 48 81 e2 00 ff ff ff and rdx,0xffffffffffffff00
result <<= small_string_length(a_obj) * kBitsPerByte;
2082: 89 c1 mov ecx,eax
2084: c1 e1 03 shl ecx,0x3
2087: 48 d3 e2 shl rdx,cl
result |= ((uword)a_obj) & ~(uword)0xFFULL;
208a: 48 81 e7 00 ff ff ff and rdi,0xffffffffffffff00
result |= length << kImmediateTagBits;
2091: 48 c1 e0 05 shl rax,0x5
2095: 81 e6 e0 00 00 00 and esi,0xe0
209b: 48 01 f0 add rax,rsi
209e: 48 09 f8 or rax,rdi
result |= kSmallStringTag;
20a1: 48 09 d0 or rax,rdx
20a4: 48 83 c8 0d or rax,0xd
struct object* result_obj = (struct object*)result;
return result_obj;
20a8: c3 ret
```