Disallow the zero Unicode character in all input.

Halibut works internally with standard C-style null-terminated strings
(or rather wide strings), so L'\0' appearing unexpectedly in the input
can cause all kinds of havoc.

It would be nice to redo all the string processing using (pointer,
length) pairs and become robust against that, but I don't think it's
realistic without a major rewrite. Zero characters have no actual use
that I can see, so a simpler fix is to just outlaw them completely.

This applies to a direct \0 appearing in the input file, and also to
any sneaky attempts to enter one via \u0000.
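
To illustrate the hazard described above, here is a minimal sketch in standard C. It is not taken from Halibut; the helper names visible_length and contains_zero_char are hypothetical. It shows that code relying on the terminator stops at the first L'\0', while a scan over a (pointer, length) pair can still see the whole buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Length as seen by null-terminated string code: counting stops at the
 * first L'\0', wherever it happens to be in the buffer.
 * (Hypothetical helper, for illustration only.) */
static size_t visible_length(const wchar_t *s)
{
    return wcslen(s);
}

/* Hypothetical check over a (pointer, length) pair: report whether any
 * of the first len characters is the zero character. */
static bool contains_zero_char(const wchar_t *buf, size_t len)
{
    return wmemchr(buf, L'\0', len) != NULL;
}
```

For a buffer holding 'a', 'b', L'\0', 'c', 'd', visible_length reports 2 even though five characters were read, so everything after the embedded zero is silently lost to any code that trusts the terminator. Rejecting such input up front avoids having to make every string routine length-aware.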
Simon Tatham
2 years ago
@@ -82,6 +82,12 @@
     do_error(NULL, "no data in input files");
 }
 
+void err_zerochar(errorstate *es, const filepos *fpos)
+{
+    es->fatal = true;
+    do_error(fpos, "the Unicode zero character is not permitted in input");
+}
+
 void err_brokencodepara(errorstate *es, const filepos *fpos)
 {
     es->fatal = true;
@@ -224,6 +224,8 @@
 void err_cantopen(errorstate *es, const char *sp);
 /* no data in input files */
 void err_nodata(errorstate *es);
+/* unexpected zero character in input file */
+void err_zerochar(errorstate *es, const filepos *fpos);
 /* line in codepara didn't begin `\c' */
 void err_brokencodepara(errorstate *es, const filepos *fpos);
 /* expected `}' after keyword */
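
The hunks shown here add only the error function and its declaration; the call sites are not included. As a rough sketch of how such a check might be wired in, the following assumes stand-in definitions of errorstate, filepos and do_error (Halibut's real ones are not shown in this diff and will differ) and a hypothetical accept_char gate through which both directly read characters and \uXXXX results would pass.

```c
#include <stdbool.h>
#include <stdio.h>
#include <wchar.h>

/* Stand-ins for illustration only; Halibut's real definitions differ. */
typedef struct { bool fatal; } errorstate;
typedef struct { const char *filename; int line, col; } filepos;

/* Simplified stand-in for the error reporter: print to stderr. */
static void do_error(const filepos *fpos, const char *msg)
{
    if (fpos)
        fprintf(stderr, "%s:%d:%d: %s\n",
                fpos->filename, fpos->line, fpos->col, msg);
    else
        fprintf(stderr, "%s\n", msg);
}

/* Same shape as the function added in this commit. */
void err_zerochar(errorstate *es, const filepos *fpos)
{
    es->fatal = true;
    do_error(fpos, "the Unicode zero character is not permitted in input");
}

/* Hypothetical gate: every character, whether read directly from the
 * file or produced by a \uXXXX escape, is checked here before it can
 * reach any null-terminated string. Returns false if it was rejected. */
static bool accept_char(errorstate *es, const filepos *fpos, wchar_t c)
{
    if (c == L'\0') {
        err_zerochar(es, fpos);
        return false;
    }
    return true;
}
```

Funnelling both input paths through one check means a \u0000 escape is rejected in the same way as a literal zero byte, and the fatal flag records that an error occurred.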