commits
Also refactor the apply tests to use a common type now that there are
three variants that do almost the same thing for each test.
Applying a file now rejects future text fragment applies under the
assumption that all relevant fragments were part of the file.
Try to avoid confusion about whether variables refer to byte or line
counts/positions. It still isn't perfect, but I'm not sure how to
further clarify without being overly verbose.
This adds minimal new coverage, since the apply tests already exercise
this, but it's complicated enough that dedicated tests will be helpful.
This removes the distinction between "strict" and "fuzzy" application
by allowing future methods on Applier that control settings. It also
avoids state tracking in the text fragment apply signature by moving it
into the Applier type.
While in practice, an Applier will be used once and discarded, the
capability is provided to reset it for multiple uses.
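A minimal Go sketch of this pattern. The names here (`Applier`, `SetFuzz`, `ApplyText`, `Reset`) are illustrative assumptions, not the library's actual API; the point is that settings become methods on the type and the "already applied" state moves out of the function signature.

```go
package main

import (
	"errors"
	"fmt"
)

// Applier holds settings and apply state, instead of threading them
// through the text fragment apply signature.
type Applier struct {
	fuzz    int  // hypothetical future setting controlled by a method
	applied bool // state that used to travel through the signature
}

// SetFuzz shows how future methods could control apply settings.
func (a *Applier) SetFuzz(n int) { a.fuzz = n }

// ApplyText refuses to run twice without a Reset, mirroring the idea
// that an Applier is normally used once and discarded.
func (a *Applier) ApplyText(fragment string) error {
	if a.applied {
		return errors.New("applier already used; call Reset")
	}
	a.applied = true
	return nil
}

// Reset restores the Applier for another use.
func (a *Applier) Reset() { a.applied = false }

func main() {
	var a Applier
	fmt.Println(a.ApplyText("@@ -1 +1 @@"))
	fmt.Println(a.ApplyText("@@ -2 +2 @@"))
	a.Reset()
	fmt.Println(a.ApplyText("@@ -2 +2 @@"))
}
```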
This is no longer used for text application, so revert the parser back
to using StringReader.
This is functionally equivalent to the previous version (except for one
error case), but uses the new interface. I think the code is simpler
overall because it removes the line tracking.
This is a line-oriented parallel to io.ReaderAt, meant for text applies.
While the mapping isn't quite as clean as in the binary case, a text
apply still reads a fixed chunk of lines starting at a specific line
number and modifies them. This also allows a consistent interface for
strict and fuzzy applies.
The implementation wraps an io.ReaderAt and reads data in chunks,
indexing line boundaries as it goes. This is probably not the most
efficient way to implement this interface, but it works and allows file
application to take a consistent interface.
This better matches the actual contract of the delta patch, which reads
fixed size chunks of the source files at arbitrary positions.
These cover some of the possible errors with delta fragments. Many of
the same tests could be implemented slightly more easily by testing the
helper functions, but the infrastructure is already set up to test the
full function.
This is made easier by the bin.go CLI, which can parse and encode binary
patch data. Once parsed, fragments were manipulated with truncate and
dd, encoded, and placed in the patch files.
This ensures a default size is used for large copy instructions.
Covers basic literal and delta application. Input files were created by
using dd and the 'conv=notrunc' option to modify sections of files
created from /dev/urandom. All files start with two null bytes to
convince Git they are binary.
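The same fixture-building workflow, translated from dd into a Go sketch (the function names here are illustrative): random content with two leading null bytes so Git detects a binary file, plus an in-place overwrite of a section that leaves the file length unchanged, like `conv=notrunc`.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// makeFixture builds a random binary blob whose first two bytes are
// null, which is enough to convince Git the file is binary.
func makeFixture(size int) []byte {
	data := make([]byte, size)
	rand.Read(data) // crypto/rand.Read does not fail on supported platforms
	data[0], data[1] = 0, 0
	return data
}

// modifySection overwrites count bytes at offset in a copy of data,
// like dd with conv=notrunc: the length is unchanged.
func modifySection(data []byte, offset, count int) []byte {
	out := append([]byte(nil), data...)
	rand.Read(out[offset : offset+count])
	return out
}

func main() {
	src := makeFixture(1024)
	dst := modifySection(src, 128, 16)
	fmt.Println(len(src), len(dst), src[0], src[1])
}
```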
Currently untested, but based on code I validated against some generated
patches. Tests coming in a future commit.
Conflict errors are now represented by an exported type that is
compatible with errors.Is instead of using the method defined on
ApplyError. This makes the tests slightly cleaner and should be more
idiomatic for clients.
These cover the errors that can happen with a single fragment and no
additional manipulation by the caller of ApplyStrict.
This ensures the line number is used and application isn't based only on
matching context lines.
This matches the type used for positions in text fragments.
io.EOF was not properly accounted for when dealing with patches that are
missing trailing newline characters.
These cover all of the cases (I think) where application should succeed.
This wraps the underlying error with optional position information and
provides a way to test if the error was due to a conflict. At the
moment, details about the conflict are not exposed outside of the
message string.
Text fragment application is implemented but untested and with several
TODOs, binary fragment application is unimplemented and will panic.
Applying a fragment requires the content to match the stored counts, so
there must be a way to check this. Parsed fragments should always be
valid, but manually created or modified fragments may be invalid.
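A simplified sketch of that check in Go. The types and field names are illustrative stand-ins for the library's fragment representation: context and deleted lines must sum to the old count, context and added lines to the new count.

```go
package main

import (
	"errors"
	"fmt"
)

// Line is a fragment line tagged with its operation.
type Line struct {
	Op byte // ' ' context, '-' deletion, '+' addition
}

// TextFragment pairs header counts with the parsed lines.
type TextFragment struct {
	OldLines, NewLines int64 // counts from the fragment header
	Lines              []Line
}

// Validate checks that the line content matches the header counts.
// Parsed fragments should always pass; hand-built ones may not.
func (f *TextFragment) Validate() error {
	var oldN, newN int64
	for _, l := range f.Lines {
		switch l.Op {
		case ' ':
			oldN++
			newN++
		case '-':
			oldN++
		case '+':
			newN++
		default:
			return fmt.Errorf("invalid line operation %q", l.Op)
		}
	}
	if oldN != f.OldLines || newN != f.NewLines {
		return errors.New("fragment line counts do not match header")
	}
	return nil
}

func main() {
	f := &TextFragment{
		OldLines: 2, NewLines: 2,
		Lines: []Line{{' '}, {'-'}, {'+'}},
	}
	fmt.Println(f.Validate())
}
```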
When applying, we need to copy all data after the last fragment line.
This function provides a way to get back the possibly-buffered io.Reader
that backs the LineReader.
Document the method as part of the type so it is processed by godoc.
LineReader is a wrapper around StringReader that counts line numbers.
This is not currently used by the parser, but will be used by the apply
functions.
The tests were already split up in this way, so it makes sense to have
the parsing functions split as well.
These are fairly basic as the decoder is also exercised by the fragment
parsing tests, but they cover some error cases that may not be covered
otherwise.
Git uses a unique Base85 encoding with different characters than the
ascii85 encoding implemented by Go, so add a custom decoding function.
Once decoded, use zlib instead of the raw DEFLATE algorithm to
decompress the data.
These issues were caught by some basic parsing tests which are added
here as well.
Parse the fragment header separately from the fragment chunk, which
makes each function a bit more understandable.
Parse forward and optionally reverse fragments, decoding and inflating
the ascii85 encoded data in a binary patch. This is completely untested
at the moment and probably has obvious and stupid bugs.
The binary marker is the text that appears where a text fragment
normally would and indicates that the file is binary. It's not quite a
header, because content is optional in a binary patch. If the patch does
include binary fragments, they have their own format, with a header.
Fragment is now TextFragment to distinguish from a future
BinaryFragment. Also rename FragmentLine to Line, since the
text-orientation is implied by the name.
This allows callers to provide patches with commit or email headers and
then retrieve that leading content for additional parsing without
having to identify where the first file in the patch starts.
While it's cool that this works, in a multi-file patch each file
immediately follows the final fragment of the previous file.
For simplicity, use the same fragments in both of these files and in the
single file test.
For now, this uses JSON to print objects on error, which is hard to
debug. It should probably use something like google/go-cmp instead,
because these objects are now too large for direct comparison.
Primarily check that leading non-header content is ignored and that
special errors (like detached fragment headers) are raised.
This should almost always be handled specially, so don't let tests that
expect errors hide the fact that they returned an io.EOF. This could be
revisited if the tests are ever updated to check for specific errors.
Covers that multiple fragments actually work and the unique error
conditions, but otherwise relies on the lower level tests for
correctness. Also update some comments and the README based on
observations from writing/debugging these tests.
These are about to get even longer, so it makes sense to separate them.
Also refactor the remaining parser tests to remove some duplication.
To make the output better, also add some (temporary?) String() functions
and fix an assumption about the smallest fragment header that I
discovered was wrong while looking at sample patches.
Correctly process "\ No newline..." marker lines when parsing fragments,
trimming the newline from the last read line and advancing the parser.
Also fix the text chunk parser to follow the invariant of not advancing
past the end of the object in the event of an error. The error in
question now has a correct line number as well.
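The marker handling can be sketched as follows; the function and constant names are illustrative, not the parser's actual internals. When the marker line follows a fragment line, the newline already read as part of the previous line must be trimmed, since the file does not actually contain it.

```go
package main

import (
	"fmt"
	"strings"
)

const noNewlineMarker = `\ No newline at end of file`

// applyNoNewline trims the trailing newline from the last parsed line
// if the next input line is a "\ ..." marker. It returns the (possibly
// modified) lines and whether the marker was consumed, so the caller
// knows to advance the parser past it.
func applyNoNewline(lines []string, next string) ([]string, bool) {
	if !strings.HasPrefix(next, `\ `) || len(lines) == 0 {
		return lines, false
	}
	last := len(lines) - 1
	lines[last] = strings.TrimSuffix(lines[last], "\n")
	return lines, true
}

func main() {
	lines := []string{"-old line\n", "+new line\n"}
	lines, ok := applyNoNewline(lines, noNewlineMarker)
	fmt.Printf("%q %v\n", lines, ok)
}
```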
The parser now returns an EOF on the first call to Next() if the input
is empty and increments the line number before returning EOF for the
first time.
Binary patch support is still unimplemented, but the functions are
stubbed and the overall structure seems to make sense. This also renames
the existing fragment functions to have "Text" in their names (for
clarity) and moves the parsing of fragment lines to a new "chunk"
function, which will match how binary parsing works.