Implement undef-aware equality operators (PPC0030, PPC0031) #22942

Draft
wants to merge 6 commits into
base: blead

leonerd
Contributor

@leonerd leonerd commented Jan 24, 2025

As per PPC0030 and PPC0031

Currently only defines the stringy version of PPC0030 (equ), not the numerical version (===), as ongoing discussions about how to spell that remain open.

Adds all four of the PPC0031 variants (eq:u, ne:u, ==:u, !=:u).
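
For readers who have not followed the two proposals, the comparison semantics can be modelled outside the interpreter. The sketch below is standalone C, not code from this branch. It represents undef as a NULL pointer, the names `model_eq_plain` and `model_equ` are hypothetical, and it assumes the rule from the PPC text that undef equals undef and is unequal to every defined value, including the empty string. Using `strcmp` is a simplification, since real Perl strings may contain NUL bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Model of a scalar's string value: NULL stands in for undef. */

/* Plain `eq`: undef is treated as "" before comparing (warnings aside). */
bool model_eq_plain(const char *left, const char *right)
{
    if (left == NULL)
        left = "";
    if (right == NULL)
        right = "";
    return strcmp(left, right) == 0;
}

/* Undef-aware `equ` / `eq:u`, under the assumed PPC rule. */
bool model_equ(const char *left, const char *right)
{
    /* If either side is undef, the result is true only when both are. */
    if (left == NULL || right == NULL)
        return left == NULL && right == NULL;
    return strcmp(left, right) == 0;
}
```

In this model, plain `eq` cannot distinguish undef from the empty string, while the undef-aware form can.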

TODO: Currently lacks any attempt at documentation or perldelta.

TODO: Also lacks any consideration on how an equ operator or eq:u flag would interact with use overload. Further thought is required here.

TODO: Lacks any tests or consideration on how actual chaining should behave.

This PR remains a draft due to the above TODO comments, as well as the fact it's entirely undecided whether PPC0030 or PPC0031 would actually be preferred.

@leonerd leonerd force-pushed the undef-aware-equality branch from bb4c810 to 042d130 Compare January 27, 2025 17:01
@leonerd leonerd changed the title Implement PPC0030 - undef-aware equality operators Implement undef-aware equality operators (PPC0030, PPC0031) Jan 27, 2025
@leonerd leonerd force-pushed the undef-aware-equality branch from 042d130 to ac020e6 Compare January 27, 2025 17:28
@@ -112,6 +114,7 @@
%type <pval> fieldvar /* pval is PADNAME */
%type <opval> optfieldattrlist fielddecl
%type <opval> termbinop termunop anonymous termdo
%type <svval> optopflags
Contributor

The new non-terminal needs clean-up in S_clear_yystack() in perly.c.

It could use dumping support in yy_stack_print() in perly.c too.

pp.c Outdated
@@ -2451,6 +2460,15 @@ PP(pp_seq)
SV *right = PL_stack_sp[0];
SV *left = PL_stack_sp[-1];

if(PL_op->op_private & OPpEQ_UNDEF) {
Contributor

The numeric ops use UNLIKELY(), but the string ops don't. Is there a reason?

Contributor Author

Oops, no. Just laziness on my part. I added them to the numerical and forgot the string. I shall make consistent.
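
For context on what the hint does: perl's `UNLIKELY()` wraps the compiler's branch-prediction builtin where one is available. The following is a minimal standalone sketch of the same pattern, not the code in pp.c. The macro `MY_UNLIKELY`, the bit value `MY_EQ_UNDEF`, and the function `model_seq` are all stand-ins invented for illustration, and NULL again models undef.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for perl's UNLIKELY(); falls back to a plain test elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
#  define MY_UNLIKELY(cond) (cond)
#endif

/* Illustrative bit only; not the value this branch assigns to OPpEQ_UNDEF. */
#define MY_EQ_UNDEF 0x10u

/* String equality with an optional undef-aware mode. */
bool model_seq(unsigned op_private, const char *left, const char *right)
{
    /* Rare path: the flag is set on few ops, so hint it as unlikely. */
    if (MY_UNLIKELY(op_private & MY_EQ_UNDEF)) {
        if (left == NULL || right == NULL)
            return left == NULL && right == NULL;
    }

    /* Common path: undef compares as the empty string. */
    if (left == NULL)
        left = "";
    if (right == NULL)
        right = "";
    return strcmp(left, right) == 0;
}
```

The hint changes only code layout and prediction, never the result, so applying it to both the numeric and string ops is purely a consistency and performance matter.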

toke.c Outdated
Comment on lines 6241 to 6244
while (isLOWER(*s) || isUPPER(*s)) {
sv_catpvn(flagsbuf, s, 1);
s++;
}
Contributor

Wouldn't this be better as something like:

const char *e = s;
while (isALPHA(*e)) // only one macro
    ++e;
 // only one function call, don't need SvPOK_on() either
sv_setpvn_fresh(flagsbuf, s, (STRLEN)(e-s));
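
The span-then-copy idea can be tried outside the tokenizer. Here is a minimal standalone sketch, assuming `isalpha` from `<ctype.h>` (in the default "C" locale) as a stand-in for perl's `isALPHA`, and a hypothetical helper `scan_flags` that reports the run length instead of filling an SV. The pointer being tested must be the same one being advanced.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the length of the run of ASCII letters starting at s.
 * The caller can then copy (s, length) in a single call. */
size_t scan_flags(const char *s)
{
    const char *e = s;
    /* Cast to unsigned char: passing a negative char to isalpha is undefined. */
    while (isalpha((unsigned char)*e))
        ++e;
    return (size_t)(e - s);
}
```

With this shape a zero-length result is an ordinary return value, so the caller can decide explicitly whether to accept or reject an empty flag list.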

Is the option string allowed to be zero length?

Is eq:uuuuuuuuuuuuuuuuu meant to be allowed?

$ ./perl -le 'print $x eq:uuuuuuuuu $y'
1

This only accepts ASCII, what about my ==:ε that compares with an epsilon? (future directions)

Setting POK when there isn't yet a string defined seems risky (and is):

$ gdb --args ./perl -le 'print $x eq: $y'
...
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./perl...
(gdb) r
Starting program: /home/tony/dev/perl/git/perl6/perl -le print\ \$x\ eq:\ \$y
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00005555555de76c in Perl_apply_opflags (opcode=90, flagstr=0x0) at op.c:16201
16201       for(char flag; (flag = *flagstr); flagstr++) {
(gdb) bt
#0  0x00005555555de76c in Perl_apply_opflags (opcode=90, flagstr=0x0)
    at op.c:16201
#1  0x00005555556c6d36 in Perl_yyparse (gramtype=258)
    at /home/tony/dev/perl/git/perl6/perly.y:1296
#2  0x00005555555e68c3 in S_parse_body (env=0x0, 
    xsinit=0x55555559c1ff <xs_init>) at perl.c:2690
#3  0x00005555555e45c7 in perl_parse (my_perl=0x555555c242a0, 
    xsinit=0x55555559c1ff <xs_init>, argc=3, argv=0x7fffffffe7e8, env=0x0)
    at perl.c:1932
#4  0x000055555559c10e in main (argc=3, argv=0x7fffffffe7e8, 
    env=0x7fffffffe808) at perlmain.c:106

(I noticed that zero length strings were accepted, and wondered about the POK with no PVX, didn't realize it would crash until I tried it.)
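
The faulting loop in the backtrace dereferences `flagstr` unconditionally. One defensive shape is sketched below as standalone C. The function `model_apply_opflags` and the bit value `MODEL_EQ_UNDEF` are hypothetical, and the choice to treat a NULL or empty string as "no flags" is an assumption; the PPC may well decide that an empty flag list should be a syntax error instead.

```c
#include <stddef.h>

/* Illustrative bit only; not the real OPpEQ_UNDEF value. */
#define MODEL_EQ_UNDEF 0x10u

/* ORs the bits named by flagstr into *op_private.
 * Returns 0 on success, -1 on an unknown flag (leaving *op_private untouched). */
int model_apply_opflags(unsigned *op_private, const char *flagstr)
{
    /* Guard: a NULL buffer means the tokenizer saw no flag letters. */
    if (flagstr == NULL)
        return 0;

    unsigned bits = 0;
    for (char flag; (flag = *flagstr) != '\0'; flagstr++) {
        switch (flag) {
        case 'u':
            bits |= MODEL_EQ_UNDEF;  /* OR-ing makes a repeated 'u' harmless */
            break;
        default:
            return -1;               /* reject letters with no meaning */
        }
    }

    *op_private |= bits;
    return 0;
}
```

In this model a repeated flag such as `uuuu` yields the same bits as a single `u`, which is one way the repeated-flag case could end up accepted without any special handling.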

Contributor Author

@leonerd leonerd Jan 30, 2025

> Wouldn't this be better as something like:
>
> const char *e = s;
> while (isALPHA(*e)) // only one macro
>     ++e;
>  // only one function call, don't need SvPOK_on() either
> sv_setpvn_fresh(flagsbuf, s, (STRLEN)(e-s));

Ah yes, that looks better.

> Is the option string allowed to be zero length?

Mmm. A fun and interesting question.

On the one hand it might make code-generation simpler if it was allowed, but on the other hand it might lead to subtle bugs in e.g.

$x eq: func()

suddenly now looking for the f, u, n, and c flags.

A question for the PPC doc I feel.

> Is eq:uuuuuuuuuuuuuuuuu meant to be allowed?
>
> $ ./perl -le 'print $x eq:uuuuuuuuu $y'
> 1

Again a fun question for the PPC doc. There's precedent with the regexp /ee combo, but I'm not sure that's the pinnacle of great design.

> This only accepts ASCII, what about my ==:ε that compares with an epsilon? (future directions)

I guess there's no firm reason not to allow non-ASCII letters; it just needs to stop at non-letter symbols, and working out which characters those are starts to get tricky outside of ASCII.

More good questions for the PPC doc :)

> Setting POK when there isn't yet a string defined seems risky (and is):
> ...
> (I noticed that zero length strings were accepted, and wondered about the POK with no PVX, didn't realize it would crash until I tried it.)

Yeah that's fixed by your suggestion.

As per PPC0030.

Currently only defines the stringy version, not the numerical version.

TODO: Currently lacks any attempt at documentation or perldelta.

TODO: Also lacks any consideration on how an `equ` operator would
  interact with `use overload`. Further thought is required here.

fixup with magic
Adds:
 * operator flag parser token type (OPFLAGS)
 * expected next token to be opflags or term (XOPFLAGTERM)
 * internal API function to modify operator opcode to add private flags
   (apply_opflags)

Still TODO:
 * More robustness testing, especially around new PL_expect value
 * Think about and test how actual chaining works with multiple of these
@leonerd leonerd force-pushed the undef-aware-equality branch from ac020e6 to f2536d2 Compare January 30, 2025 01:13