\chapter{SWI-Prolog extensions}
\label{sec:extensions}

This chapter describes extensions to the Prolog language introduced with
SWI-Prolog version~7. The changes bring more modern syntactical
conventions to Prolog, such as key-value maps (called \jargon{dicts}) as
first-class citizens and a restricted form of \jargon{functional notation}.
They also extend the basic Prolog types with strings, providing a natural
notation for textual material as opposed to identifiers (atoms) and
lists.

These extensions make the syntax more intuitive to new users, simplify
the integration of domain specific languages (DSLs) and facilitate a
more natural Prolog representation for popular exchange languages such
as XML and JSON.

While many programs run unmodified in SWI-Prolog version~7, some require
modifications, especially those that pass double quoted strings to
general purpose list processing predicates. We provide a tool
(list_strings/0) that we used to port a huge code base in half a day.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Lists are special}
\label{sec:ext-lists}

As of version~7, SWI-Prolog lists can be distinguished unambiguously at
runtime from \functor{.}{2} terms and the atom \const{'[]'}. The
constant \verb$[]$ is a special constant that is not an atom.  It has
the following properties:

\begin{code}
?- atom([]).
false.
?- atomic([]).
true.
?- [] == '[]'.
false.
?- [] == [].
true.
\end{code}

The `cons' operator for creating list cells has changed from the pretty
atom \verb$'.'$ to the ugly atom \verb$'[|]'$, so we can use the
\verb$'.'$ for other purposes.  See \secref{ext-dict-functions}.
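
The queries below are a small sketch of how the new list constructor can
be observed from Prolog. They assume the default (non-traditional) mode
and use only standard unification and compound_name_arity/3:

\begin{code}
?- [a,b] = '[|]'(Head, Tail).
Head = a,
Tail = [b].

?- compound_name_arity([a,b], Name, Arity).
Name = '[|]',
Arity = 2.
\end{code}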

This modification has minimal impact on typical Prolog code. It does
affect foreign code (see \secref{foreign}) that uses the normal atom and
compound term interface for manipulating lists. In most cases this can
be avoided by using the dedicated list functions. For convenience, the
macros \const{ATOM_nil} and \const{ATOM_dot} are provided by
\file{SWI-Prolog.h}.

Another place that is affected is write_canonical/1. Impact is minimized
by using the list syntax for lists.  The predicates read_term/2 and
write_term/2 support the option \term{dotlists}{true}, which causes
read_term/2 to read \verb$.(a,[])$ as \verb$[a]$ and write_term/2 to
write \verb$[a]$ as \verb$.(a,[])$.


\subsection{Motivating '\Scons' and \Snil{} for lists}
\label{sec:ext-list-motivation}

Representing lists the conventional way, using \functor{.}{2} as
cons-cell and '[]' as list terminator, poses two independent
conflicts, both of which are easily avoided.

\begin{itemize}
    \item Using \functor{.}{2} prevents using this commonly used symbol
as an operator because \verb$a.B$ cannot be distinguished from \verb$[a|B]$.
Freeing \functor{.}{2} provides us with a unique term that we can use
for functional notation on dicts as described in
\secref{ext-dict-functions}.

    \item Using \verb$'[]'$ as list terminator prevents dynamic distinction
between atoms and lists. As a result, we cannot use type polymorphism
that involves both atoms and lists. For example, we cannot use
\jargon{multi lists} (arbitrary deeply nested lists) of atoms. Multi
lists of atoms are in some situations a good representation of a flat
list that is assembled from sub sequences. The alternative, using
difference lists or DCGs, is often less natural and sometimes demands
`opening' proper lists (i.e., copying the list while replacing the
terminating empty list with a variable) that have to be added to the
sequence.  The ambiguity of atom and list is particularly painful when
mapping external data representations that do not suffer from this
ambiguity.

At the same time, avoiding \verb$'[]'$ as a list terminator makes
the various text representations unambiguous, which allows us to write
predicates that require a textual argument to accept atoms,
strings, and lists of character codes or one-character atoms.
Traditionally, the empty list can be interpreted both as the string "[]"
and "".
\end{itemize}
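
The multi list case can be illustrated with a short grammar. This is
only a sketch: the non-terminal \verb$multi_atoms//1$ is a hypothetical
name introduced here, and the code relies on \verb$atom([])$ failing,
so it behaves differently in \cmdlineoption{--traditional} mode.

\begin{code}
% Flatten an arbitrarily nested list of atoms into a flat list.
% Because [] is not an atom, the first clause never consumes the
% list terminator, while the atom '[]' is kept as ordinary data.
multi_atoms(X) --> { atom(X) }, !, [X].
multi_atoms([]) --> [].
multi_atoms([H|T]) --> multi_atoms(H), multi_atoms(T).
\end{code}

With these clauses loaded, the atom \verb$'[]'$ survives as an element
while the empty list contributes nothing:

\begin{code}
?- phrase(multi_atoms([a, [b, '[]'], [], c]), Atoms).
Atoms = [a, b, '[]', c].
\end{code}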

% ================================================================
\section{The string type and its double quoted syntax}
\label{sec:strings}

As of SWI-Prolog version~7, text enclosed in double quotes (e.g.,
\verb$"Hello world"$) is read as objects of the type \jargon{string}. A
string is a compact representation of a character sequence that lives on
the global (term) stack. Strings represent sequences of Unicode
characters including the character code 0 (zero). The length of strings is
limited by the available space on the global (term) stack (see
set_prolog_stack/2). Strings are distinct from lists, which makes it
possible to detect them at runtime and print them using the string
syntax, as illustrated below:

\begin{code}
?- write("Hello world!").
Hello world!

?- writeq("Hello world!").
"Hello world!"
\end{code}

\jargon{Back quoted} text (as in \verb$`text`$) is mapped to a list of
character codes in version~7. The settings for the flags that control
how double and back quoted text is read are summarised in
\tabref{quote-mapping}. Programs that aim for compatibility should
realise that the ISO standard defines back quoted text, but does not
define the \prologflag{back_quotes} Prolog flag and does not define the
term that is produced by back quoted text.

\begin{table}
\begin{center}
\begin{tabular}{lcc}
\hline
\bf Mode & \prologflag{double_quotes} & \prologflag{back_quotes} \\
\hline
Version~7 default & string & codes \\
\cmdlineoption{--traditional} & codes & symbol_char \\
\hline
\end{tabular}
\end{center}
    \caption{Mapping of double and back quoted text in the two
	     modes.}
    \label{tab:quote-mapping}
\end{table}
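
The version~7 defaults in the table can be checked with a few
queries. This is a sketch that assumes the flags have not been
changed and that Prolog was not started with
\cmdlineoption{--traditional}:

\begin{code}
% Double quoted text is a string, not an atom or a list.
?- string("abc"), \+ atom("abc"), \+ is_list("abc").
true.

% Back quoted text is a list of character codes.
?- `abc` == [0'a, 0'b, 0'c].
true.

% The two notations denote the same text.
?- string_codes("abc", `abc`).
true.
\end{code}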


\Secref{ext-dquotes-motivation} motivates the introduction of strings
and mapping double quoted text to this type.

\subsection{Predicates that operate on strings}
\label{sec:string-predicates}

Strings may be manipulated by a set of predicates similar to those for
manipulating atoms. In addition to the list below, string/1 performs
the type check for this type and is described in \secref{typetest}.

SWI-Prolog's string primitives are being synchronized with
\href{http://eclipseclp.org/wiki/Prolog/Strings}{ECLiPSe}. We expect the
set of predicates documented in this section to be stable, although it
might be expanded. In general, SWI-Prolog's text manipulation predicates
accept any form of text as input argument and produce the type indicated
by the predicate name as output. This policy simplifies migration and
writing programs that can run unmodified or with minor modifications on
systems that do not support strings. Code should avoid relying on this
feature as much as possible for clarity as well as to facilitate a more
strict mode and/or type checking in future releases.

\begin{description}
    \predicate{atom_string}{2}{?Atom, ?String}
Bi-directional conversion between an atom and a string. At
least one of the two arguments must be instantiated. \arg{Atom} can also
be an integer or floating point number.

    \predicate{number_string}{2}{?Number, ?String}
Bi-directional conversion between a number and a string. At least one of
the two arguments must be instantiated. Besides the type used to
represent the text, this predicate differs in several ways from its
ISO cousin:\footnote{Note that SWI-Prolog's syntax for numbers is not
ISO compatible either.}

    \begin{itemize}
	\item If \arg{String} does not represent a number, the
	      predicate \emph{fails} rather than throwing a syntax
	      error exception.
	\item Leading white space and Prolog comments are \emph{not}
	      allowed.
	\item Numbers may start with '+' or '-'.
	\item It is \emph{not} allowed to have white space between
	      a leading '+' or '-' and the number.
	\item Floating point numbers in exponential notation do not
	      require a dot before exponent, i.e., \verb$"1e10"$ is
	      a valid number.
    \end{itemize}
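
Because number_string/2 fails on text that is not a number, it can be
used directly for input validation. The sketch below uses a
hypothetical helper, \nopredref{parse_count}{2}, that accepts only
non-negative integers:

\begin{code}
% parse_count(+Text, -Count) is semidet.
% True when Text denotes a non-negative integer Count.
parse_count(Text, Count) :-
    number_string(Count, Text),   % fails, rather than throws, on non-numbers
    integer(Count),
    Count >= 0.
\end{code}

For example:

\begin{code}
?- parse_count("42", N).
N = 42.

?- parse_count("forty-two", N).
false.
\end{code}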

    \predicate{term_string}{2}{?Term, ?String}
Bi-directional conversion between a term and a string. If \arg{String}
is instantiated, it is parsed and the result is unified with \arg{Term}.
Otherwise \arg{Term} is `written' using the option \term{quoted}{true}
and the result is converted to \arg{String}.

    \predicate{term_string}{3}{?Term, ?String, +Options}
As term_string/2, passing \arg{Options} to either read_term/2
or write_term/2.  For example:

\begin{code}
?- term_string(Term, 'a(A)', [variable_names(VNames)]).
Term = a(_G1466),
VNames = ['A'=_G1466].
\end{code}

    \predicate{string_chars}{2}{?String, ?Chars}
Bi-directional conversion between a string and a list of characters
(one-character atoms). At least one of the two arguments must be
instantiated.

    \predicate{string_codes}{2}{?String, ?Codes}
Bi-directional conversion between a string and a list of character
codes. At least one of the two arguments must be instantiated.

    \predicate[det]{text_to_string}{2}{+Text, -String}
Converts \arg{Text} to a string.  \arg{Text} is an atom, string
or list of characters (codes or chars).	 When running in
\cmdlineoption{--traditional} mode, \verb$'[]'$ is ambiguous and
interpreted as an empty string.

    \predicate{string_length}{2}{+String, -Length}
Unify \arg{Length} with the number of characters in \arg{String}. This
predicate is functionally equivalent to atom_length/2 and also accepts
atoms, integers and floats as its first argument.

    \predicate{string_code}{3}{?Index, +String, ?Code}
True when \arg{Code} represents the character at the 1-based \arg{Index}
position in \arg{String}. If \arg{Index} is unbound the string is
scanned from index 1. Raises a domain error if \arg{Index} is negative.
Fails silently if \arg{Index} is zero or greater than the length of
\arg{String}. The mode \term{string_code}{-,+,+} is deterministic if the
searched-for \arg{Code} appears only once in \arg{String}.  See also
sub_string/5.

    \predicate{get_string_code}{3}{+Index, +String, -Code}
Semi-deterministic version of string_code/3. In addition, this version
provides strict range checking, throwing a domain error if \arg{Index}
is less than 1 or greater than the length of \arg{String}. ECLiPSe
provides this to support \verb$String[Index]$ notation.

    \predicate{string_concat}{3}{?String1, ?String2, ?String3}
Similar to atom_concat/3, but the unbound argument will be unified with
a string object rather than an atom. Also, if both \arg{String1} and
\arg{String2} are unbound and \arg{String3} is bound to text, it breaks
\arg{String3}, unifying the start with \arg{String1} and the end with
\arg{String2} as append does with lists. Note that this is not
particularly fast on long strings, as for each redo the system has to
create two entirely new strings, while the list equivalent only creates
a single new list-cell and moves some pointers around.
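
As a small illustration of the mode in which only \arg{String3} is
bound, the query below collects all ways to split a two-character
string. The variable names are arbitrary:

\begin{code}
?- findall(Pre-Post, string_concat(Pre, Post, "ab"), Splits).
Splits = [""-"ab", "a"-"b", "ab"-""].
\end{code}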

    \predicate[det]{split_string}{4}{+String, +SepChars, +PadChars, -SubStrings}
Break \arg{String} into \arg{SubStrings}. The \arg{SepChars} argument
provides the characters that act as separators and thus the length of
\arg{SubStrings} is one more than the number of separators found if
\arg{SepChars} and \arg{PadChars} do not have common characters. If
\arg{SepChars} and \arg{PadChars} are equal, sequences of adjacent
separators act as a single separator. Leading and trailing characters
for each substring that appear in \arg{PadChars} are removed from the
substring. The input arguments can be either atoms, strings or char/code
lists. Compatible with ECLiPSe. Below are some examples:

\begin{code}
% a simple split
?- split_string("a.b.c.d", ".", "", L).
L = ["a", "b", "c", "d"].
% Consider sequences of separators as a single one
?- split_string("/home//jan///nice/path", "/", "/", L).
L = ["home", "jan", "nice", "path"].
% split and remove white space
?- split_string("SWI-Prolog, 7.0", ",", " ", L).
L = ["SWI-Prolog", "7.0"].
% only remove leading and trailing white space
?- split_string("  SWI-Prolog  ", "", "\s\t\n", L).
L = ["SWI-Prolog"].
\end{code}

In the typical use cases, \arg{SepChars} either does not overlap
\arg{PadChars} or is equal to it, so that multiple adjacent separators
(often white space) are handled as a single one. The behaviour with partially
overlapping sets of padding and separators should be considered
undefined.  See also read_string/5.

    \predicate{sub_string}{5}{+String, ?Before, ?Length, ?After, ?SubString}
\arg{SubString} is a substring of \arg{String}. There are \arg{Before}
characters in \arg{String} before \arg{SubString}, \arg{SubString}
contains \arg{Length} characters and is followed by \arg{After}
characters in \arg{String}. If not enough information is provided to
compute the start of the match, \arg{String} is scanned left-to-right.
This predicate is functionally equivalent to sub_atom/5, but operates on
strings. The following example splits a string of the form
<name>=<value> into the name part (an atom) and the value (a string).

\begin{code}
name_value(String, Name, Value) :-
	sub_string(String, Before, _, After, "="), !,
	sub_string(String, 0, Before, _, NameString),
	atom_string(Name, NameString),
	sub_string(String, _, After, 0, Value).
\end{code}

    \predicate{atomics_to_string}{2}{+List, -String}
\arg{List} is a list of strings, atoms, integers or floating point
numbers. Succeeds if \arg{String} can be unified with the concatenated
elements of \arg{List}. Equivalent to \term{atomics_to_string}{List,
'', String}.

    \predicate{atomics_to_string}{3}{+List, +Separator, -String}
Creates a string just like atomics_to_string/2, but inserts
\arg{Separator} between each pair of inputs. For example:

\begin{code}
?- atomics_to_string([gnu, "gnat", 1], ', ', A).

A = "gnu, gnat, 1".
\end{code}

    \predicate{string_upper}{2}{+String, -UpperCase}
Convert \arg{String} to upper case and unify the result with
\arg{UpperCase}.

    \predicate{string_lower}{2}{+String, -LowerCase}
Convert \arg{String} to lower case and unify the result with
\arg{LowerCase}.

    \predicate{read_string}{3}{+Stream, ?Length, -String}
Read at most \arg{Length} characters from \arg{Stream} and
return them in the string \arg{String}.  If \arg{Length} is
unbound, \arg{Stream} is read to the end and \arg{Length} is
unified with the number of characters read.

    \predicate{read_string}{5}{+Stream, +SepChars, +PadChars, -Sep, -String}
Read a string from \arg{Stream}, providing functionality similar to
split_string/4.  The predicate performs the following steps:

    \begin{enumerate}
    \item Skip all characters that match \arg{PadChars}
    \item Read up to a character that matches \arg{SepChars} or end of file
    \item Discard trailing characters that match \arg{PadChars} from
          the collected input
    \item Unify \arg{String} with a string created from the input and
          \arg{Sep} with the separator character read.  If input was
	  terminated by the end of the input, \arg{Sep} is unified
	  with -1.
    \end{enumerate}

The predicate read_string/5 called repeatedly on an input until
\arg{Sep} is -1 (end of file) is equivalent to reading the entire file
into a string and calling split_string/4, provided that \arg{SepChars}
and \arg{PadChars} are not \emph{partially
overlapping}.\footnote{Behaviour that is fully compatible would require
unlimited look-ahead.}  Below are some examples:

\begin{code}
% Read a line
read_string(Input, "\n", "\r", End, String)
% Read a line, stripping leading and trailing white space
read_string(Input, "\n", "\r\t ", End, String)
% Read up to , or ), unifying End with 0', or 0')
read_string(Input, ",)", "\t ", End, String)
\end{code}

    \predicate{open_string}{2}{+String, -Stream}
True when \arg{Stream} is an input stream that accesses the content of
\arg{String}.  \arg{String} can be any text representation, i.e.,
string, atom, list of codes or list of characters.
\end{description}
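
The stream-oriented predicates above can be combined. The following is
a minimal sketch that splits text into lines by opening it as a stream
with open_string/2 and calling read_string/5 until it reports end of
input. The names \nopredref{string_lines}{2} and
\nopredref{read_lines}{2} are hypothetical helpers, not library
predicates:

\begin{code}
% string_lines(+Text, -Lines) is det.
% Lines is a list of strings, one per line of Text.  A trailing
% carriage return is removed from each line.
string_lines(Text, Lines) :-
    setup_call_cleanup(
        open_string(Text, In),
        read_lines(In, Lines),
        close(In)).

read_lines(In, Lines) :-
    read_string(In, "\n", "\r", Sep, Line),
    (   Sep == -1                   % input ended without a newline
    ->  (   Line == ""
        ->  Lines = []
        ;   Lines = [Line]
        )
    ;   Lines = [Line|Rest],
        read_lines(In, Rest)
    ).
\end{code}

For example:

\begin{code}
?- string_lines("a\nb\r\nc", Lines).
Lines = ["a", "b", "c"].
\end{code}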


\subsection{Representing text: strings, atoms and code lists}
\label{sec:text-representation}

With the introduction of strings as a Prolog data type, there are three
main ways to represent text: using strings, atoms or code lists. This
section explains what to choose for what purpose. Both strings and atoms
are \jargon{atomic} objects: you can only look inside them using
dedicated predicates. Lists of character codes are compound
data structures.

\begin{description}
    \item [Lists of character codes]
is what you need if you want to \emph{parse} text using Prolog grammar
rules (DCGs, see phrase/3). Most of the text reading predicates (e.g.,
read_line_to_codes/2) return a list of character codes because most
applications need to parse these lines before the data can be processed.

    \item [Atoms]
are \emph{identifiers}. They are typically used where identity
comparison is the main operation and where the text is typically neither
composed nor taken apart. Examples are RDF resources (URIs that identify
something), system identifiers (e.g., \verb$'Boeing 747'$), but also
individual words in a natural language processing system. They are also
used where other languages would use \jargon{enumerated types}, such as
the names of days in the week. Unlike enumerated types, Prolog atoms do
not form a fixed set and the same atom can represent different
things in different contexts.

    \item [Strings]
typically represent text that is processed as a unit most of the time,
but which is not an identifier for something.  Format specifications for
format/3 are a good example. Another example is a descriptive text
provided in an application.  Strings may be composed and decomposed
using e.g., string_concat/3 and sub_string/5 or converted for parsing
using string_codes/2 or created from codes generated by a generative
grammar rule, also using string_codes/2.
\end{description}
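
The division of labour between the representations can be sketched as
follows: text is kept as a string, converted to a code list with
string_codes/2 only for parsing, and the result is an ordinary Prolog
term. The predicate \nopredref{string_integer}{2} and the grammar rules
are hypothetical names used for illustration:

\begin{code}
% A tiny grammar over character codes.
digits([D|T]) --> digit(D), digits(T).
digits([D])   --> digit(D).

digit(D) --> [D], { code_type(D, digit) }.

% string_integer(+String, -Int) is semidet.
% True when String consists of one or more decimal digits.
string_integer(String, Int) :-
    string_codes(String, Codes),        % string -> code list for the DCG
    once(phrase(digits(Ds), Codes)),
    number_codes(Int, Ds).
\end{code}

For example:

\begin{code}
?- string_integer("2024", X).
X = 2024.

?- string_integer("20x4", X).
false.
\end{code}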


\subsection{Adapting code for double quoted strings}
\label{sec:ext-dquotes-port}

The predicates in this section can help adapt your program to the
new convention for handling double quoted strings. We have adapted a
huge code base with which we were not familiar in about half a day.

\begin{description}
    \predicate{list_strings}{0}{}
This predicate may be used to assess compatibility issues due to
the representation of double quoted text as string objects. See
\secref{strings} and \secref{ext-dquotes-motivation}.  To
use it, load your program into Prolog and run list_strings/0.  The
predicate lists source locations of string objects encountered in
the program that are not considered safe.  Such strings need to be
examined manually, after which one of the actions below may be
appropriate:

\begin{itemize}
    \item Rewrite the code.  For example, change  \verb$[X] = "a"$
          into \verb$X = 0'a$.
    \item If a particular module relies heavily on representing
          strings as lists of character codes, consider adding the
	  following directive to the module.  Note that this flag
	  only applies to the module in which it appears.

	  \begin{code}
	  :- set_prolog_flag(double_quotes, codes).
	  \end{code}
    \item Use a back quoted string (e.g., \verb$`text`$).  Note
	  that this will not make your code run regardless of
	  the \cmdlineoption{--traditional} command line option
	  and code exploiting this mapping is also not portable
	  to ISO compliant systems.
    \item If the strings appear in facts and usage is safe, add a
          clause to the multifile predicate check:string_predicate/1
	  to silence list_strings/0 on all clauses of that predicate.
    \item If the strings appear as an argument to a predicate that
          can handle string objects, add a clause to the multifile
	  predicate check:valid_string_goal/1 to silence list_strings/0.
\end{itemize}

    \predicate{check:string_predicate}{1}{:PredicateIndicator}
Declare that \arg{PredicateIndicator} has clauses that contain strings,
but that this is safe. For example, if there is a predicate
\nopredref{help_info}{2}, where the second argument contains a double
quoted string that is handled properly by the predicates of the
application's help system, add the following declaration to stop
list_strings/0 from complaining:

\begin{code}
:- multifile check:string_predicate/1.

check:string_predicate(user:help_info/2).
\end{code}

    \predicate{check:valid_string_goal}{1}{:Goal}
Declare that calls to \arg{Goal} are safe.  The module qualification
is the actual module in which \arg{Goal} is defined.  For example, a
call to format/3 is resolved by the predicate system:format/3, and
the code below specifies that the second argument may be a string
(system predicates that accept strings are defined in the library).

\begin{code}
:- multifile check:valid_string_goal/1.

check:valid_string_goal(system:format(_,S,_)) :- string(S).
\end{code}
\end{description}


\subsection{Why has the representation of double quoted text changed?}
\label{sec:ext-dquotes-motivation}

Prolog defines two forms of quoted text. Traditionally, single quoted
text is mapped to atoms while double quoted text is mapped to a list of
\jargon{character codes} (integers) or characters represented as
1-character atoms. Representing text using atoms is often considered
inadequate for several reasons:

\begin{itemize}
    \item It hides the conceptual difference between text and
          program symbols.  Where content of text often matters because
	  it is used in I/O, program symbols are merely identifiers
	  that match with the same symbol elsewhere. Program symbols
	  can often be consistently replaced, for example to obfuscate
	  or compact a program.

    \item Atoms are globally unique identifiers.  They are stored
          in a shared table.  Volatile strings represented as atoms
	  come at a significant price due to the required cooperation
	  between threads for creating atoms. Reclaiming
	  temporary atoms using \jargon{Atom garbage collection} is a
	  costly process that requires significant synchronisation.

    \item Many Prolog systems (not SWI-Prolog) put severe restrictions
          on the length of atoms or the maximum number of atoms.
\end{itemize}

Representing text as a list of character codes or 1-character atoms
also comes at a price:

\begin{itemize}
    \item It is not possible to distinguish (at runtime) a list of
          integers or atoms from a string.  Sometimes this information
	  can be derived from (implicit) typing.  In other cases the
	  list must be embedded in a compound term to distinguish
	  the two types.  For example, \verb$s("hello world")$ could
	  be used to indicate that we are dealing with a string.

	  Lacking runtime information, debuggers and the toplevel can
	  only use heuristics to decide whether to print a list of
	  integers as such or as a string (see portray_text/1).

	  While experienced Prolog programmers have learned to cope
	  with this, we still consider this an unfortunate situation.

    \item Lists are expensive structures, taking 2 cells per character
          (3 for SWI-Prolog in its current form).  This stresses memory
	  consumption on the stacks, while pushing them on the stack and
	  dealing with them during garbage collection is unnecessarily
	  expensive.
\end{itemize}

We observe that in many programs, most strings are only handled as a
single unit during their lifetime. Examining real code tells us that
double quoted strings typically appear in one of the following roles:

\begin{description}
    \item [ A DCG literal ]  Although a list of codes
is the correct representation for handling in DCGs, the DCG translator
can recognise the literal and convert it to the proper representation.
Such code need not be modified.

    \item [ A format string ]  This is a typical example of text that
is conceptually not a program identifier.  Format is designed to deal
with alternative representations of the format string.  Such code
need not be modified.

    \item [ Getting a character code ] The construct \verb$[X] = "a"$
is a commonly used template for getting the character code of the
letter 'a'.  ISO Prolog defines the syntax \verb$0'a$ for this purpose.
Code using this must be modified.  The modified code will run on any
ISO compliant processor.

    \item [ As argument to list predicates to operate on strings ]
Here, we see code such as \verb$append("name:", Rest, Codes)$.  Such
code needs to be modified.  In this particular example, the
following is a good portable alternative: \verb$phrase("name:", Codes, Rest)$.

    \item [ Checks for a character to be in a set ]
Such tests are often performed with code such as this:
\verb.memberchk(C, "~!@#$").. This is a rather inefficient check in a
traditional Prolog system because it pushes a list of character codes
cell-by-cell onto the Prolog stack and then traverses this list
cell-by-cell to see whether one of the cells unifies with \arg{C}. If
the test is successful, the string will eventually be subject to garbage
collection.  The best code for this is to write a predicate as below,
which pushes nothing on the stack and performs an indexed lookup to see
whether the character code is in `my_class'.

\begin{code}
my_class(0'~).
my_class(0'!).
...
\end{code}

An alternative to reach the same effect is to use term expansion to
create the clauses:

\begin{code}
term_expansion(my_class(_), Clauses) :-
	findall(my_class(C),
		string_code(_, "~!@#$", C),
		Clauses).

my_class(_).
\end{code}

Finally, the predicate string_code/3 can be exploited directly as a
replacement for the memberchk/2 on a list of codes. Although the string
is still pushed onto the stack, it is more compact and only a single
entity.
\end{description}
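
The rewrites for the character-code and list-predicate roles above can
be sketched as follows. The predicate names \nopredref{code_of_a}{1}
and \nopredref{name_value}{2} are hypothetical and serve only as
illustration; atom_codes/2 is used in the query to keep it independent
of the interpretation of double quoted text.

\begin{code}
% Instead of [X] = "a":
code_of_a(X) :-
    X = 0'a.

% Instead of append("name:", Rest, Codes):
name_value(Codes, Rest) :-
    phrase("name:", Codes, Rest).

?- code_of_a(X).
X = 97.

?- atom_codes('name:bob', Codes), name_value(Codes, Rest),
   atom_codes(Value, Rest).
Codes = [110, 97, 109, 101, 58, 98, 111, 98],
Rest = [98, 111, 98],
Value = bob.
\end{code}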

We offer the predicate list_strings/0 to help porting your program.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Syntax changes}
\label{sec:ext-syntax}

\subsection{Operators and quoted atoms}
\label{sec:ext-syntax-op}

As of SWI-Prolog version~7, quoted atoms lose their operator property.
This means that expressions such as \verb$A = 'dynamic'/1$ are valid
syntax, regardless of the operator definitions. From questions on the
mailing list this is what people expect.\footnote{We believe that most
users expect an operator declaration to define a new token, which would
explain why the operator name is often quoted in the declaration, but
not while the operator is used. We are afraid that allowing for this
easily creates ambiguous syntax. Also, many development environments are
based on tokenization. Having dynamic tokenization due to operator
declarations would make it hard to support Prolog in such editors.} To
accommodate real quoted operators, a quoted atom that \emph{needs}
quotes can still act as an operator.\footnote{Suggested by Joachim
Schimpf.} A good use-case for this is a unit
library\footnote{\url{https://groups.google.com/d/msg/comp.lang.prolog/ozqdzI-gi_g/2G16GYLIS0IJ}},
which allows for expressions such as below.

\begin{code}
?- Y isu 600kcal - 1h*200'W'.
Y = 1790400.0'J'.
\end{code}


\subsection{Compound terms with zero arguments}
\label{sec:ext-compound-zero}

As of SWI-Prolog version~7, the system supports compound terms that have
no arguments. This implies that e.g., \exam{name()} is valid syntax.
This extension aims at functions on dicts (see \secref{bidicts}) as well
as the implementation of domain specific languages (DSLs). To minimise
the consequences, the classic predicates functor/3 and \predref{=..}{2}
have not been modified. The predicates compound_name_arity/3 and
compound_name_arguments/3 have been added. These predicates operate only
on compound terms and behave consistently for compounds with zero
arguments. Code that \jargon{generalises} a term using the sequence
below should generally be changed to use compound_name_arity/3.

\begin{code}
    ...,
    functor(Specific, Name, Arity),
    functor(General, Name, Arity),
    ...,
\end{code}
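
A minimal sketch of this rewrite is given below. The predicate name
\nopredref{generalise}{2} is hypothetical. The compound/1 guard is an
assumption added for this example: compound_name_arity/3 operates only
on compound terms, so atomic terms are passed through unchanged.

\begin{code}
generalise(Specific, General) :-
    compound(Specific), !,
    compound_name_arity(Specific, Name, Arity),
    compound_name_arity(General, Name, Arity).
generalise(Atomic, Atomic).

?- generalise(name(), General).
General = name().

?- generalise(point(1, 2), General), General = point(a, b).
General = point(a, b).
\end{code}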

Replacement of \predref{=..}{2} by compound_name_arguments/3 is
typically needed to deal with code that follows the skeleton below.

\begin{code}
    ...,
    Term0 =.. [Name|Args0],
    maplist(convert, Args0, Args),
    Term =.. [Name|Args],
    ...,
\end{code}
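
The skeleton above might be rewritten as sketched below. The names
\nopredref{convert_args}{2} and \nopredref{convert}{2} are hypothetical;
the conversion step simply doubles numeric arguments for the sake of the
example. Note that this version requires \arg{Term0} to be a compound
term, but it also handles compounds with zero arguments.

\begin{code}
convert_args(Term0, Term) :-
    compound_name_arguments(Term0, Name, Args0),
    maplist(convert, Args0, Args),
    compound_name_arguments(Term, Name, Args).

% Hypothetical conversion step: double each numeric argument.
convert(X, Y) :-
    Y is X*2.

?- convert_args(point(1, 2), T).
T = point(2, 4).

?- convert_args(empty(), T).
T = empty().
\end{code}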

For predicates, goals and arithmetic functions (evaluable terms), <name>
and <name>() are \emph{equivalent}. Below are some examples that
illustrate this behaviour.

\begin{code}
go() :- format('Hello world~n').

?- go().
Hello world

?- go.
Hello world

?- Pi is pi().
Pi = 3.141592653589793.

?- Pi is pi.
Pi = 3.141592653589793.
\end{code}

Note that the \emph{canonical} representation of predicate heads and
functions without arguments is an atom. Thus, \term{clause}{go(), Body}
returns the clauses for \nopredref{go}{0}, but \term{clause}{-Head,
-Body, +Ref} unifies \arg{Head} with an atom if the clause specified by
\arg{Ref} is part of a predicate with zero arguments.


\subsection{Block operators}
\label{sec:ext-blockop}

Block operators introduce curly bracket and array subscripting.\footnote{Introducing
block operators was proposed by Jose Morales. It was discussed in the
Prolog standardization mailing list, but there were too many conflicts
with existing extensions (ECLiPSe and B-Prolog) and doubt about their
need to reach an agreement. Increasing need to get to some solution
resulted in what is documented in this section. These extensions are
also implemented in recent versions of YAP.} The symbols \verb$[]$ and
\verb${}$ may be declared as an operator, which has the following
effect:

\begin{description}
    \termitem{[~]}{}
This operator is typically declared as a low-priority \const{yf} postfix
operator, which allows for \verb$array[index]$ notation. This
syntax produces a term \verb$[]([index],array)$.

    \termitem{\{~\}}{}
This operator is typically declared as a low-priority \const{xf} postfix
operator, which allows for \verb$head(arg) { body }$ notation.  This
syntax produces a term \verb${}({body},head(arg))$.
\end{description}

Below is an example that illustrates the representation of a typical
`curly bracket language' in Prolog.

\begin{code}
?- op(100, xf, {}).
?- op(100, yf, []).
?- op(1100, yf, ;).

?- displayq(func(arg)
	    { a[10] = 5;
	      update();
	    }).
{}({;(=([]([10],a),5),;(update()))},func(arg))
\end{code}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Dicts: structures with named arguments}
\label{sec:bidicts}

SWI-Prolog version~7 introduces dicts as an abstract object with a
concrete modern syntax and functional notation for accessing members
as well as access functions defined by the user. The syntax for a dict is
illustrated below. \arg{Tag} is either a variable or an atom. As with
compound terms, there is \textbf{no} space between the tag and the
opening brace. The keys are either atoms or small integers (up to
\prologflag{max_tagged_integer}). The values are arbitrary Prolog terms
which are parsed using the same rules as used for arguments in compound
terms.

\begin{quote}
Tag\{Key1:Value1, Key2:Value2, ...\}
\end{quote}

A dict can \emph{not} hold duplicate keys. The dict is transformed into
an opaque internal representation that does \emph{not} respect the order
in which the key-value pairs appear in the input text. If a dict is
written, the keys are written according to the standard order of terms
(see \secref{standardorder}). Here are some examples, where the second
example illustrates that the order is not maintained and the third
illustrates an anonymous dict.

\begin{code}
?- A = point{x:1, y:2}.
A = point{x:1, y:2}.

?- A = point{y:2, x:1}.
A = point{x:1, y:2}.

?- A = _{first_name:"Mel", last_name:"Smith"}.
A = _G1476{first_name:"Mel", last_name:"Smith"}.
\end{code}

Dicts can be unified following the standard symmetric Prolog unification
rules. As dicts use an internal canonical form, the order in which the
named keys are represented is not relevant. This behaviour is
illustrated by the following example.

\begin{code}
?- point{x:1, y:2} = Tag{y:2, x:X}.
Tag = point,
X = 1.
\end{code}

\textbf{Note} In the current implementation, two dicts unify only if
they have the same set of keys and the tags and values associated with
the keys unify. In future versions, the notion of unification between
dicts could be modified such that two dicts unify if their tags and the
values associated with \emph{common} keys unify, turning both dicts into
a new dict that has the union of the keys of the two original dicts.
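
The queries below sketch the current behaviour: plain unification
fails if the key sets differ, while the partial unification operator
\predref{>:<}{2} (see \secref{ext-dict-predicates}) only considers the
common keys.

\begin{code}
?- point{x:1, y:2} = point{x:1}.
false.

?- point{x:1, y:2} >:< point{x:X}.
X = 1.
\end{code}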


\subsection{Functions on dicts}
\label{sec:ext-dict-functions}

The infix operator dot (\term{op}{100, yfx, .}) is used to extract values
and evaluate functions on dicts. Functions are recognised if they appear
in the argument of a \jargon{goal} in the source text, possibly nested
in a term. The keys act as field selectors, which is illustrated in this
example.

\begin{code}
?- X = point{x:1,y:2}.x.
X = 1.

?- Pt = point{x:1,y:2}, write(Pt.y).
2
Pt = point{x:1,y:2}.

?- X = point{x:1,y:2}.C.
X = 1,
C = x ;
X = 2,
C = y.
\end{code}

The compiler translates a goal that contains \functor{.}{2} terms in its
arguments into a conjunction of calls to \predref{.}{3} defined in the
\const{system} module. Terms \functor{.}{2} that appear in the head are
replaced with a variable and calls to \predref{.}{3} are inserted at the
start of the body. Below are two examples, where the first extracts the
\const{x} key from a dict and the second extends a dict containing an
address with the postal code, given a \nopredref{find_postal_code}{4}
predicate.

\begin{code}
dict_x(X, X.x).

add_postal_code(Dict, Dict.put(postal_code, Code)) :-
	find_postal_code(Dict.city,
			 Dict.street,
			 Dict.house_number,
			 Code).
\end{code}
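
As a usage sketch, the first predicate behaves as shown below. The
second query assumes that a missing key raises the three-argument
existence error that is also described for the destructive operations
in \secref{ext-dict-assignment}.

\begin{code}
dict_x(X, X.x).

?- dict_x(point{x:1, y:2}, X).
X = 1.

?- catch(dict_x(point{y:2}, _),
         error(existence_error(key, Key, _), _),
         true).
Key = x.
\end{code}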

Note that expansion of \functor{.}{2} terms implies that such terms
cannot be created by writing them explicitly in your source code. Such
terms can still be created with functor/3, \predref{=..}{2},
compound_name_arity/3 and
compound_name_arguments/3.\footnote{Traditional code is unlikely to use
\functor{.}{2} terms because they were practically reserved for usage in
lists. We do not provide a quoting mechanism as found in functional
languages because it would only be needed to quote \functor{.}{2} terms,
such terms are rare and term manipulation provides an escape route.}

\begin{description}
    \predicate{.}{3}{+Dict, +Function, -Result}
This predicate is called to evaluate \functor{.}{2} terms found in the
arguments of a goal. This predicate evaluates the field extraction
described above, which is mapped to get_dict_ex/3. If \arg{Function} is a
compound term, it checks for the predefined functions on dicts described
in \secref{ext-dicts-predefined} or executes a user defined function as
described in \secref{ext-dict-user-functions}.
\end{description}


\subsubsection{User defined functions on dicts}
\label{sec:ext-dict-user-functions}

The tag of a dict associates the dict to a module.  If the dot
notation uses a compound term, this calls the goal below.

\begin{quote}
<module>:<name>(Arg1, ..., +Dict, -Value)
\end{quote}

Functions are normal Prolog predicates. The dict infrastructure provides
a more convenient syntax for representing the head of such predicates
without worrying about the argument calling conventions. The code below
defines a function \term{multiply}{Times} on a point that creates a new
point by multiplying both coordinates, and \term{len}{}\footnote{as
\term{length}{} would result in a predicate length/2, this name cannot
be used. This might change in future versions.} to compute the length
from the origin. The . and \verb$:=$ operators are used to abstract the
location of the predicate arguments. It is allowed to define a
function with multiple clauses, providing overloading and
non-determinism.

\begin{code}
:- module(point, []).

M.multiply(F) := point{x:X, y:Y} :-
	X is M.x*F,
	Y is M.y*F.

M.len() := Len :-
	Len is sqrt(M.x**2 + M.y**2).
\end{code}

After these definitions, we can evaluate the following functions:

\begin{code}
?- X = point{x:1, y:2}.multiply(2).
X = point{x:2, y:4}.

?- X = point{x:1, y:2}.multiply(2).len().
X = 4.47213595499958.
\end{code}
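
Because functions are normal predicates that follow the calling
convention given at the start of this section, the function can
presumably also be called as a plain goal. The query below is a sketch
that assumes the module \const{point} defined above has been loaded.

\begin{code}
% Assumes the point module above is loaded.
?- point:multiply(2, point{x:1, y:2}, X).
X = point{x:2, y:4}.
\end{code}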

\subsubsection{Predefined functions on dicts}
\label{sec:ext-dicts-predefined}

Dicts currently define the following reserved functions:

\begin{description}
    \dictfunction{get}{1}{?Key}
Same as \arg{Dict}.\arg{Key}, but maps to get_dict/3 instead of
get_dict_ex/3.  This implies that the function evaluation fails
silently if \arg{Key} does not appear in \arg{Dict}.  See also
\predref{:<}{2}, which can be used to test for existence and
unify multiple key values from a dict.  For example:

\begin{code}
?- write(t{a:x}.get(a)).
x
?- write(t{a:x}.get(b)).
false.
\end{code}

    \dictfunction{put}{1}{+New}
Evaluates to a new dict where the key-values in \arg{New} replace
or extend the key-values in the original dict.  See put_dict/3.

    \dictfunction{put}{2}{+KeyPath, +Value}
Evaluates to a new dict where the \arg{KeyPath}-\arg{Value} replaces or
extends the key-values in the original dict. \arg{KeyPath} is either a
key or a term \arg{KeyPath}/\arg{Key},\footnote{Note that we do not use
the '.' functor here, because the \functor{.}{2} would \emph{evaluate}.}
replacing the value associated with \arg{Key} in a sub-dict of the dict
on which the function operates. See put_dict/4. Below are some examples:

\begin{code}
?- A = _{}.put(a, 1).
A = _G7359{a:1}.

?- A = _{a:1}.put(a, 2).
A = _G7377{a:2}.

?- A = _{a:1}.put(b/c, 2).
A = _G1395{a:1, b:_G1584{c:2}}.

?- A = _{a:_{b:1}}.put(a/b, 2).
A = _G1429{a:_G1425{b:2}}.

?- A = _{a:1}.put(a/b, 2).
A = _G1395{a:_G1578{b:2}}.
\end{code}
\end{description}


\subsection{Predicates for managing dicts}
\label{sec:ext-dict-predicates}

This section documents the predicates that are defined on dicts.  We use
the naming and argument conventions of the traditional \pllib{assoc}.

\begin{description}
    \predicate{is_dict}{1}{@Term}
True if \arg{Term} is a dict.  This is the same as \exam{is_dict(Term,_)}.

    \predicate{is_dict}{2}{@Term, -Tag}
True if \arg{Term} is a dict of \arg{Tag}.

    \predicate{get_dict}{3}{?Key, +Dict, -Value}
Unify the value associated with \arg{Key} in dict with \arg{Value}.  If
\arg{Key} is unbound, all associations in \arg{Dict} are returned on
backtracking.  The order in which the associations are returned is
undefined.  This predicate is normally accessed using the functional
notation \exam{Dict.Key}.  See \secref{ext-dict-functions}.

Fails silently if \arg{Key} does not appear in \arg{Dict}.  This is
different from the behavior of the functional \verb$.$-notation, which
throws an existence error in that case.

    \predicate{get_dict_ex}{3}{?Key, +Dict, -Value}
As get_dict/3, but throws an existence exception if \arg{Key} cannot be
found in dict.  This is used when evaluating \arg{Dict}.\arg{Key} as a
function.

    \predicate[semidet]{get_dict}{5}{+Key, +Dict, -Value, -NewDict, +NewValue}
Create a new dict after updating the value for \arg{Key}.  Fails if
\arg{Value} does not unify with the current value associated with
\arg{Key}.  \arg{Dict} is either a dict or a list that can be converted
into a dict.

Has the behavior as if defined in the following way:

\begin{code}
get_dict(Key, Dict, Value, NewDict, NewValue) :-
	get_dict(Key, Dict, Value),
	put_dict(Key, Dict, NewValue, NewDict).
\end{code}

    \predicate{dict_create}{3}{-Dict, +Tag, +Data}
Create a dict in \arg{Tag} from \arg{Data}. \arg{Data} is a list of
attribute-value pairs using the syntax \exam{Key:Value},
\exam{Key=Value}, \exam{Key-Value} or \exam{Key(Value)}. An exception is
raised if \arg{Data} is not a proper list, one of the elements is not of
the shape above, a key is neither an atom nor a small integer or there
is a duplicate key.

    \predicate{dict_pairs}{3}{?Dict, ?Tag, ?Pairs}
Bi-directional mapping between a dict and an ordered list of pairs
(see \secref{pairs}).

    \predicate{put_dict}{3}{+New, +DictIn, -DictOut}
\arg{DictOut} is a new dict created by replacing or adding key-value pairs
from \arg{New} to \arg{DictIn}. \arg{New} is either a dict or a valid input
for dict_create/3. This predicate is normally accessed using the
functional notation. Below are some examples:

\begin{code}
?- A = point{x:1, y:2}.put(_{x:3}).
A = point{x:3, y:2}.

?- A = point{x:1, y:2}.put([x=3]).
A = point{x:3, y:2}.

?- A = point{x:1, y:2}.put([x=3,z=0]).
A = point{x:3, y:2, z:0}.
\end{code}

    \predicate{put_dict}{4}{+Key, +DictIn, +Value, -DictOut}
\arg{DictOut} is a new dict created by replacing or adding
\arg{Key}-\arg{Value} to \arg{DictIn}.  For example:

\begin{code}
?- A = point{x:1, y:2}.put(x, 3).
A = point{x:3, y:2}.
\end{code}

This predicate can also be accessed by using the functional notation,
in which case \arg{Key} can also be a \emph{path} of keys.  For example:

\begin{code}
?- Dict = _{}.put(a/b, c).
Dict = _6096{a:_6200{b:c}}.
\end{code}

    \predicate{del_dict}{4}{+Key, +DictIn, ?Value, -DictOut}
True when \arg{Key}-\arg{Value} is in \arg{DictIn} and \arg{DictOut}
contains all associations of \arg{DictIn} except for \arg{Key}.

    \infixop[semidet]{:<}{+Select}{+From}
True when \arg{Select} is a `sub dict' of \arg{From}: the tags
must unify and all keys in \arg{Select} must appear with unifying
values in \arg{From}.  \arg{From} may contain keys that are not in
\arg{Select}.  This operation is frequently used to \emph{match}
a dict and at the same time extract relevant values from it.
For example:

\begin{code}
plot(Dict, On) :-
	_{x:X, y:Y, z:Z} :< Dict, !,
	plot_xyz(X, Y, Z, On).
plot(Dict, On) :-
	_{x:X, y:Y} :< Dict, !,
	plot_xy(X, Y, On).
\end{code}

The goal \verb$Select :< From$ is equivalent to
\term{select_dict}{Select, From, _}.

    \predicate[semidet]{select_dict}{3}{+Select, +From, -Rest}
True when the tags of \arg{Select} and \arg{From} have been unified,
all keys in \arg{Select} appear in \arg{From} and the corresponding
values have been unified. The key-value pairs of \arg{From} that do not
appear in \arg{Select} are used to form an anonymous dict, which is
unified with \arg{Rest}.  For example:

\begin{code}
?- select_dict(P{x:0, y:Y}, point{x:0, y:1, z:2}, R).
P = point,
Y = 1,
R = _G1705{z:2}.
\end{code}

See also \predref{:<}{2} to ignore \arg{Rest} and \predref{>:<}{2} for
a symmetric partial unification of two dicts.

    \infixop{>:<}{+Dict1}{+Dict2}
This operator specifies a \jargon{partial unification} between
\arg{Dict1} and \arg{Dict2}. It is true when the tags and the values
associated with all \emph{common} keys have been unified.  The values
associated to keys that do not appear in the other dict are ignored.
Partial unification is symmetric.  For example, given a list of dicts,
find dicts that represent a point with X equal to zero:

\begin{code}
    member(Dict, List),
    Dict >:< point{x:0, y:Y}.
\end{code}

See also \predref{:<}{2} and select_dict/3.
\end{description}
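
The queries below sketch how some of the predicates described above
behave. The tags and keys are arbitrary illustrations.

\begin{code}
?- dict_create(D, point, [x=1, y-2, z(3)]).
D = point{x:1, y:2, z:3}.

?- dict_pairs(point{y:2, x:1}, Tag, Pairs).
Tag = point,
Pairs = [x-1, y-2].

?- del_dict(x, point{x:1, y:2}, V, D).
V = 1,
D = point{y:2}.

?- get_dict(z, point{x:1, y:2}, V).
false.

?- _{x:X} :< point{x:1, y:2}.
X = 1.
\end{code}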


\subsubsection{Destructive assignment in dicts}
\label{sec:ext-dict-assignment}

This section describes the destructive update operations defined on
dicts. These actions can only \emph{update} keys and not add or remove
keys. If the requested key does not exist the predicate raises
\term{existence_error}{key, Key, Dict}. Note the additional argument.

Destructive assignment is a non-logical operation and should be used
with care because the system may copy or share identical Prolog terms
at any time. Some of this behaviour can be avoided by adding an
additional unbound value to the dict. This prevents unwanted sharing
and ensures that copy_term/2 actually copies the dict. This pitfall is
demonstrated in the example below:

\begin{code}
?- A = a{a:1}, copy_term(A,B), b_set_dict(a, A, 2).
A = B, B = a{a:2}.

?- A = a{a:1,dummy:_}, copy_term(A,B), b_set_dict(a, A, 2).
A = a{a:2, dummy:_G3195},
B = a{a:1, dummy:_G3391}.
\end{code}


\begin{description}
    \predicate[det]{b_set_dict}{3}{+Key, !Dict, +Value}
Destructively update the value associated with \arg{Key} in \arg{Dict} to
\arg{Value}. The update is trailed and undone on backtracking. This
predicate raises an existence error if \arg{Key} does not appear in
\arg{Dict}. The update semantics are equivalent to setarg/3 and
b_setval/2.

    \predicate[det]{nb_set_dict}{3}{+Key, !Dict, +Value}
Destructively update the value associated with \arg{Key} in \arg{Dict} to
a copy of \arg{Value}. The update is \emph{not} undone on backtracking.
This predicate raises an existence error if \arg{Key} does not appear in
\arg{Dict}. The update semantics are equivalent to nb_setarg/3 and
nb_setval/2.

    \predicate[det]{nb_link_dict}{3}{+Key, !Dict, +Value}
Destructively update the value associated with \arg{Key} in \arg{Dict} to
\arg{Value}. The update is \emph{not} undone on backtracking. This
predicate raises an existence error if \arg{Key} does not appear in
\arg{Dict}.  The update semantics are equivalent to nb_linkarg/3 and
nb_linkval/2. Use with extreme care and consult the documentation of
nb_linkval/2 before use.
\end{description}
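
The difference between the backtrackable and non-backtrackable
variants can be sketched as follows. The tag \const{counter} and the
key \const{count} are hypothetical; the update is made inside a failing
branch to show whether it survives backtracking.

\begin{code}
?- D = counter{count:0},
   (   b_set_dict(count, D, 1),
       fail
   ;   true
   ).
D = counter{count:0}.

?- D = counter{count:0},
   (   nb_set_dict(count, D, 1),
       fail
   ;   true
   ).
D = counter{count:1}.
\end{code}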


\subsection{When to use dicts?}
\label{sec:ext-dicts-usage}

Dicts are a new type in the Prolog world. They compete with several other
types and libraries. In the list below we have a closer look at these
relations. We will see that dicts are first of all a good replacement for
compound terms with a high or not clearly fixed arity, library
\pllib{record} and option processing.

\begin{description}
    \item [Compound terms]
Compound terms with positional arguments form the traditional way to
package data in Prolog.  This representation is well understood, fast
and compound terms are stored efficiently.  Compound terms are still
the representation of choice, provided that the number of arguments is
low and fixed or compactness or performance are of utmost importance.

A good example of a compound term is the representation of RDF triples
using the term \term{rdf}{Subject, Predicate, Object} because RDF
triples are defined to have precisely these three arguments and they are
always referred to in this order. An application processing information
about persons should probably use dicts because the information that is
related to a person is not so fixed. Typically we see first and last
name. But there may also be title, middle name, gender, date of birth,
etc. The number of arguments becomes unmanageable when using a compound
term, while adding or removing an argument leads to many changes in the
program.

    \item [Library \pllib{record}]
Using library \pllib{record} relieves the maintenance issues associated
with using compound terms significantly.  The library generates access
and modification predicates for each field in a compound term from a
declaration.  The library provides sound access to compound terms with
many arguments.  One of its problems is the verbose syntax needed to
access or modify fields which results from long names for the generated
predicates and the restriction that each field needs to be extracted
with a separate goal.  Consider the example below, where the first uses
library \pllib{record} and the second uses dicts.

\begin{code}
    ...,
    person_first_name(P, FirstName),
    person_last_name(P, LastName),
    format('Dear ~w ~w,~n~n', [FirstName, LastName]).

    ...,
    format('Dear ~w ~w,~n~n', [Dict.first_name, Dict.last_name]).
\end{code}

Records have a fixed number of arguments and (non-)existence of an
argument must be represented using a value that is outside the normal
domain.  This leads to unnatural code.  For example, suppose our person
also has a title.  If we know the first name we use this and else we
use the title.  The code samples below illustrate this.

\begin{code}
salutation(P) :-
    person_first_name(P, FirstName), nonvar(FirstName), !,
    person_last_name(P, LastName),
    format('Dear ~w ~w,~n~n', [FirstName, LastName]).
salutation(P) :-
    person_title(P, Title), nonvar(Title), !,
    person_last_name(P, LastName),
    format('Dear ~w ~w,~n~n', [Title, LastName]).

salutation(P) :-
    _{first_name:FirstName, last_name:LastName} :< P, !,
    format('Dear ~w ~w,~n~n', [FirstName, LastName]).
salutation(P) :-
    _{title:Title, last_name:LastName} :< P, !,
    format('Dear ~w ~w,~n~n', [Title, LastName]).
\end{code}

    \item [Library \pllib{assoc}]
This library implements a balanced binary tree.  Dicts can replace
the use of this library if the association is fairly static (i.e.,
there are few update operations), all keys are atoms or (small)
integers and the code does not rely on ordered operations.

    \item [Library \pllib{option}]
Option lists are introduced by ISO Prolog, for example for read_term/3,
open/4, etc.  The \pllib{option} library provides operations to extract
options, merge option lists, etc.  Dicts are well suited to replace
option lists because they are cheaper, can be processed faster and
have a more natural syntax.

    \item [Library \pllib{pairs}]
This library is commonly used to process large name-value associations.
In many cases this concerns short-lived datastructures that result from
findall/3, maplist/3 and similar list processing predicates. Dicts may
play a role if frequent random key lookups are needed on the resulting
association. For example, the skeleton `create a pairs list', `use
list_to_assoc/2 to create an assoc', followed by frequent usage of
get_assoc/3 to extract key values can be replaced using dict_pairs/3
and the dict access functions. Using dicts in this scenario is more
efficient and provides a more pleasant access syntax.
\end{description}
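
The pairs scenario above can be sketched as follows. The predicate
\nopredref{word_lengths}{2} and the tag \const{lengths} are
hypothetical. The sketch assumes that the words are distinct atoms, as
dict_pairs/3 raises an exception on duplicate keys.

\begin{code}
word_lengths(Words, Dict) :-
    findall(Word-Len,
            ( member(Word, Words),
              atom_length(Word, Len)
            ),
            Pairs),
    dict_pairs(Dict, lengths, Pairs).

?- word_lengths([fig, apple], D), get_dict(apple, D, Len).
D = lengths{apple:5, fig:3},
Len = 5.
\end{code}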


\subsection{A motivation for dicts as primary citizens}
\label{sec:ext-dicts-motivation}

Dicts, or key-value associations, are a common data structure. A good old
example is formed by \jargon{property lists} as found in Lisp, while a
good recent example is formed by JavaScript \jargon{objects}. Traditional
Prolog does not offer native property lists. As a result, people are
using a wide range of data structures for key-value associations:

\begin{itemize}
    \item Using compound terms and positional arguments, e.g.,
          \exam{point(1,2)}.
    \item Using compound terms with library \pllib{record}, which
	  generates access predicates for a term using positional
	  arguments from a description.
    \item Using lists of terms \exam{Name=Value}, \exam{Name-Value},
          \exam{Name:Value} or \exam{Name(Value)}.
    \item Using library \pllib{assoc} which represents the
          associations as a balanced binary tree.
\end{itemize}

This situation is unfortunate. Each of these has its advantages and
disadvantages. E.g., compound terms are compact and fast, but inflexible
and using positional arguments quickly breaks down. Library
\pllib{record} fixes this, but the syntax is considered hard to use.
Lists are flexible, but expensive and the alternative key-value
representations that are used complicate the matter even more. Library
\pllib{assoc} allows for efficient manipulation of changing
associations, but the syntactical representation of an assoc is complex,
which makes them unsuitable for e.g., \jargon{options lists} as seen in
predicates such as open/4.


\subsection{Implementation notes about dicts}
\label{sec:ext-dicts-implementation}

Although dicts are designed as an abstract data type and we deliberately
reserve the possibility to change the representation and even use
multiple representations, this section describes the current
implementation.

Dicts are currently represented as a compound term using the functor
\verb$`dict`$. The first argument is the tag. The remaining arguments
create an array of sorted key-value pairs. This representation is
compact and guarantees good locality. Lookup is order $\log{N}$, while
adding values, deleting values and merging with other dicts has order
$N$. The main disadvantage is that changing values in large dicts is
costly, both in terms of memory and time.

Future versions may share keys in a separate structure or use binary
trees to allow for cheaper updates. One of the issues is that the
representation must either be kept canonical or unification must be
extended to compensate for alternate representations.


% ================================================================
\section{Integration of strings and dicts in the libraries}
\label{sec:ext-integration}

Many predicates and libraries were designed when proper string support
and dicts were lacking and therefore use interfaces that must be
classified as suboptimal. Changing these interfaces is likely to break
much more code than the changes described in this chapter. This section
discusses some of these issues. Roughly, there are two cases. Where
key-value associations or text is required as \emph{input}, we can
facilitate the new features by overloading the accepted types.
Interfaces that produce text or key-value associations as their
\emph{output}, however, must make a choice. We plan to resolve that
either by using options that specify the desired output or by providing
an alternative library.


\subsection{Dicts and option processing}
\label{sec:ext-dict-options}

System predicates and predicates based on library \pllib{option}
process dicts as an alternative to traditional option lists.


\subsection{Dicts in core data structures}
\label{sec:ext-dict-in-core-data}

Some predicates now produce structured data using compound terms and
access predicates. We consider migrating these to dicts. Below is a
tentative list of candidates. Portable code should use the provided
access predicates and not rely on the term representation.

\begin{itemize}
    \item Stream position terms
    \item Date and time records
\end{itemize}


\subsection{Dicts, strings and XML}
\label{sec:ext-xml}

The XML representation could benefit significantly from the new
features. In due time we plan to provide a set of alternative
predicates and options to existing predicates that can be used to
exploit the new types. We propose the following changes to the data
representation:

\begin{itemize}
    \item The attribute list of the \term{element}{Name, Attributes, Content}
will become a dict.
    \item Attribute values will remain atoms
    \item CDATA in element content will be represented as strings
\end{itemize}

\subsection{Dicts, strings and JSON}
\label{sec:ext-json}

The JSON representation could benefit significantly from the new
features. In due time we plan to provide a set of alternative
predicates and options to existing predicates that can be used to
exploit the new types. We propose the following changes to the data
representation:

\begin{itemize}
    \item Instead of using \term{json}{KeyValueList}, the new
interface will translate JSON objects to a dict.  The type of
this dict will be \const{json}.

    \item String values in JSON will be mapped to strings.

    \item The values \const{true}, \const{false} and \const{null}
will be represented as atoms.
\end{itemize}


\subsection{Dicts, strings and HTTP}
\label{sec:ext-http}

The HTTP library and related data structures would profit from
exploiting dicts.  Below is a list of data structures that might
be affected by future changes.  Code can be made more robust
by using the \pllib{option} library functions for extracting
values from these structures.

\begin{itemize}
    \item The HTTP request structure
    \item The HTTP parameter interface
    \item URI components
    \item Attributes to HTML elements
\end{itemize}


%================================================================
\section{Remaining issues}
\label{sec:ext-issues}

The changes and extensions described in this chapter resolve many
limitations of the Prolog language we have encountered. Still, there are
remaining issues for which we seek solutions in the future.

\paragraph{Text representation}

Although strings resolve this issue for many applications, we are still
faced with the representation of text as lists of characters which we
need for parsing using DCGs. The ISO standard provides two
representations, a list of \jargon{character codes} (`codes' for short)
and a list of \jargon{one-character atoms} (`chars' for short). There
are two sets of predicates, named *_code(s) and *_char(s) that provide
the same functionality (e.g., atom_codes/2 and atom_chars/2) using their
own representation of characters. Codes can be used in arithmetic
expressions, while chars are more readable. Neither can unambiguously be
interpreted as a representation for text because codes can be
interpreted as a list of integers and chars as a list of atoms.

We have not found a convincing way out. One of the options could be the
introduction of a `char' type. This type can be allowed in arithmetic
and with the 0'<char> syntax we have a concrete syntax for it.


\paragraph{Arrays}

Although lists are generally a much cleaner alternative for Prolog, real
arrays with direct access to elements can be useful for particular
tasks. The problem of integrating arrays is twofold. First of all, there
is no good one-size-fits-all data representation for arrays. Many tasks
that involve arrays require \jargon{mutable} arrays, while Prolog data
is immutable by design. Second, standard Prolog has no good syntax
support for arrays. SWI-Prolog version~7 has `block operators' (see
\secref{ext-blockop}) which can resolve the syntactic issues. Block
operators have been adopted by YAP.


\paragraph{Lambda expressions}

Although many alternatives\footnote{See e.g.,
\url{http://www.complang.tuwien.ac.at/ulrich/Prolog-inedit/ISO-Hiord}}
have been proposed, we still feel uneasy with them.


\paragraph{Loops}

Many people have explored routes to avoid the need for recursion in
Prolog for simple iterations over data. ECLiPSe have proposed
\jargon{logical loops} \cite{logicalloops:2002}, while B-Prolog
introduced \jargon{declarative loops} and \jargon{list
comprehension}\footnote{\url{http://www.probp.com/download/loops.pdf}}.
The above-mentioned lambda expressions, combined with maplist/2, can
achieve similar results.