Examples

CellSpeak Strings

The basic type for strings in CellSpeak is the array of bytes. The type zt (for zero-terminated) is an intermediate type derived from the type byte[] and is defined in the platform package. That package also defines a number of methods and functions to handle zero-terminated byte strings.

Based on that zero-terminated type, several other string types are defined, such as ascii, ansi and utf8, which take into account the encoding of the characters within a zero-terminated string.

The following examples are fairly straightforward and show how to work with strings in CellSpeak.

As with all arrays, strings that are assigned to are allocated automatically. Strings that are not part of the permanent data of a cell are allocated in scratchpad memory, which is reset automatically after the execution of each message handler. In most cases, explicit allocation and de-allocation are therefore unnecessary.

(1) We do not have to allocate space for the string first; we can simply assign.

(2) Expressions between [ and ] in a string are replaced by their stringified values.

(3) The string is one byte longer because of the newline character: \n = 1 byte
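
The relation between the lengths printed in Test 1 can be modelled in Python. This is only a sketch: the helpers `zt_len` and `zt_lenz` are hypothetical names made up for this illustration, not CellSpeak functions. They mimic what `len()` and `lenz()` report in the listing, that is the byte count without and with the terminating 0 byte.

```python
def zt_len(buf: bytes) -> int:
    """Number of bytes before the first 0 byte (hypothetical helper)."""
    return buf.index(0)


def zt_lenz(buf: bytes) -> int:
    """Same count, but including the terminating 0 byte (hypothetical helper)."""
    return zt_len(buf) + 1


# The two values assigned to 'a' in Test 1, with the terminator made explicit.
short = b"Short string\x00"
with_newline = b"\nShort string\x00"

print(zt_len(short))          # 12
print(zt_len(with_newline))   # 13: the newline escape is a single byte
print(zt_lenz(with_newline))  # 14: the terminating 0 byte is counted too
```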


(4) Strings can span several lines. Leading white space is included in the string.


(5) You can use the special character sequence \- to indicate that leading white space should not be included. This string will be aligned to the left of the window it is printed in.

(6) The original string-value in a is lost here. But we do not have to de-allocate and reallocate. a is a variable that is local to the constructor and is maintained on the scratchpad memory. When exiting the constructor, all scratchpad memory is released.

(7) Instead of is you can also use the familiar == sign. The comparison function for strings is a byte-wise comparison.
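
What a byte-wise comparison means can be sketched in Python. The function below is an illustration written for this page, not the CellSpeak implementation: the source only states that the result is 0 for equal strings, negative when the first string comes earlier and positive when it comes later, so the exact non-zero values returned here are an assumption.

```python
def compare(a: bytes, b: bytes) -> int:
    """Byte-wise comparison: negative if a sorts before b, 0 if equal, positive otherwise."""
    for x, y in zip(a, b):
        if x != y:
            # The first differing byte decides the result.
            return x - y
    # One string is a prefix of the other: the shorter one sorts first.
    return len(a) - len(b)


a = b"abcdef"
b = b"abcdghij"

print(compare(a, b) < 0)   # True: 'e' comes before 'g'
print(compare(b, a) > 0)   # True
print(compare(a, a) == 0)  # True
```

Because only byte values are compared, upper-case ASCII letters sort before lower-case ones, and multi-byte encodings are ordered by their encoded bytes rather than by any alphabet.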

(8) We can use slices to assign to parts of an array. i:i+3 will assign to four bytes in the array: i, i+1, i+2, i+3
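
The slice in note (8) includes both of its bounds. For comparison, here is the same replacement in Python, where the end index of a slice is excluded and therefore has to be one higher. This is a sketch that assumes the inclusive behaviour described in the note; the Python code is only an analogy for the CellSpeak statement in Test 5.

```python
a = bytearray(b"Bob and Alice share the same password :'1234' !")
original_length = len(a)

i = a.find(b"1234")

# CellSpeak (per note 8): a[i:i+3] covers the four bytes i, i+1, i+2, i+3.
# Python slices are half-open, so the equivalent range is i .. i+4.
a[i:i + 4] = b"abcd"

print(a.decode("ascii"))
```

The replacement has the same length as the slice, so the string keeps its size and the bytes after the slice stay in place.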

(9) The string is not copied because we use a reference assignment. Note that even if we re-assign the original string, this string will remain valid because it is allocated on the scratchpad memory. If we change the original string, the changes will also be visible in this string.
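
The difference between a reference assignment and a copying assignment can be modelled in Python with a memoryview over a bytearray. This is only an analogy, under the assumption that a CellSpeak reference shares the bytes of the original array as note (9) describes. The variable names and the sample sentence are made up for the illustration.

```python
text = bytearray(b"They were Scott Carpenter, Gordon Cooper and John Glenn. More text.")

start = text.find(b"Scott")
stop = text.find(b".", start)             # index of the first '.' after 'Scott'

names_ref = memoryview(text)[start:stop]  # a view: shares the bytes of text
names_copy = bytes(text[start:stop])      # an independent copy of the same range

text[start] = ord("s")                    # change the original buffer

print(names_ref.tobytes())  # the change is visible through the view
print(names_copy)           # the copy still holds the old bytes
```

Python slices exclude the end index, so the view stops just before the '.' and no terminating 0 byte has to be written, unlike in the CellSpeak listing.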

03 CellSpeak Strings.celsrc
use Windows, Math, Strings, Editor

-- We create one cell of which the constructor contains the examples
group StringHandling

design Demonstrator is

	-- Create the window cell for output (from the editor group)
	cell Window = create MenuWindow("String test")

	constructor is
	
		Window <- print("\nSome tests with strings")

	-. 1. Basics .-

		Window <- print("\n\n***Test1: basics\n")
		
(1)		ansi a = "Short string" 	-- a 12 byte string 
		
		-- print the string and its length. Note that quotation marks inside the string are escaped with \
(2)		Window <- print("\nThe string \"[a]\" is [a.len()] bytes long") 
(3)		a = "\nShort string" --13 bytes
		Window <- print("\nThe string \"[a]\" is [a.len()] bytes long - the new line character was added at the front") 
		Window <- print("\nThe string \"[a]\" is [a.lenz()] bytes long - if we include the terminating 0 byte.") 
		
	-. 2. Multiline Strings .-

		-- Multiline strings are also possible
		-- leading white space on a new line is normally kept; to strip it, use the toggle \-
		Window <- print("\n\n***Test2: multi-line strings\n")
(4)		ansi b =   "
					This is a string
					which spans several lines
					and is aligned with the first line.
					Leading white space remains."
(5)		ansi b2 =   "\n\-This is also a string
					which spans several lines
					and is aligned with the first line.
					But here, leading white space is removed."
		Window <- print(b)		
		Window <- print(b2)		
		Window <- print("\n\nThe strings are [b.len()] and [b2.len()] bytes long")
		
	-. 3. String assignments .-

		-- You can re-assign a string to an existing string. 
		Window <- print("\n\n***Test3: re-assign a constant string to an existing string\n")
(6)		a = "\nThis is a slightly longer string than before."
		Window <- print(a)
		
		-- we can do the same with two variables
		a = "first"
		b = "second"
		
		-- print the current values of a and b
		Window <- print("\nNow a = '[a]' and b = '[b]' ")
		
		-- we swap a and b - note that array content is copied
		a,b = b,a	
		Window <- print("\nSwapped a and b -> now a = '[a]' and b = '[b]' ")
		
		-- swap without copying arrays - use a reference assignment
		a,b := b,a
		Window <- print("\nSwapped a and b again -> now a = '[a]' and b = '[b]' ")		
			
	-. 4. String comparisons .-

		Window <- print("\n\n***Test4: string comparisons: 0 means equal, negative earlier and positive later in the alphabet : a < b\n")
		a = "abcdef"
		b = "abcdghij"
		Window <- print("\ncompare([a],[b]) returns [compare(a,b)]")
		Window <- print("\ncompare([b],[a]) returns [compare(b,a)]")
		Window <- print("\ncompare([b],[b]) returns [compare(b,b)]")
		Window <- print("\ncompare([a],[a]) returns [compare(a,a)]")
		
(7)		if a is b then
			Window <- print("\n'[a]' and '[b]' are equal")
		else
			Window <- print("\n'[a]' and '[b]' are not equal")
		end
		
		-- We can also use the shorter construct:  expr ? action if true : action if false
		a is a ? Window <- print("\n'[a]' is always equal to itself !")
		a > b  ? Window <- print("\n'[a]' > '[b]'") : Window<-print("\n'[b]' > '[a]'") 
		
		-- cast to ansi is required - operation is only defined for ansi operands
		a is "abcdef" ?  Window <- print("\n'[a]' and 'abcdef' are indeed equal")
		
		if a is "abcdef" then
			Window <- print("\n'[a]' and 'abcdef' are equal")
		else 
			Window <- print("\n'[a]' and 'abcdef' are not equal")
		end

	-. 5. Modifying a string .-

		Window <- print("\n\n***Test5: Modifying a string\n")
		
		-- 0x58 is a byte - be careful with multi-byte character strings when doing this
		a[3] = 0x58
		
		-- We can also use a 'character' notation for a byte 
		a[4] = 0x'Y'
		
		-- to print the literal text a[3] we put an escape before the [ - without it, [3] would be evaluated and a[3] would print as a3
		Window <- print("\na\[3] and a\[4]  - the fourth and the fifth character - are changed in '[a]' ")	
		
		-- converting to upper / lower case - the method call will change the string
		-- note that lower and upper do not return the string, so we cannot use these directly in the format string
		a = "Kathy and Peter lived in Spain for 2 years."
		Window <- print("\nString:     '[a]'")
		a.upper() 
		Window <- print("\nUpper case: '[a]'")
		a.lower() 
		Window <- print("\nLower case: '[a]'")
		
		-- converting to upper / lower case - a itself remains unchanged
		-- This form of lower and upper return a string and can be used in the format string.
		a = "Bob and Alice share the same password :'1234' !"
		ansi c,d
		Window <- print("\nString:     '[a]'\nUpper case: '[ c = upper(a) ]'\nLower case: '[ d = lower(a) ]'")	
		
		-- Searching and replacing a sub-string
		var i = a.find("1234")		
(8)		a[i:i+3] = "abcd"
		Window <- print("\n[a] - much better now.")

		-- another example ...		
		a = "Neil Armstrong was the first man on the moon."
		Window <- print("\nFound 'A' at position [a.find(0x41)] in '[a]'")
		Window <- print("\nFound 'first' at position [a.find(\"first\")] in '[a]'.")		
		
	-. 6. Appending strings .-
			
		Window <- print("\n\n***Test6: Appending strings\n")
		
		-- there are several ways to append strings ...
		-- some are shorter or more readable than others - what to use depends on the circumstances
		ansi Intro = "My name is"
		ansi Name = "Ozymandias"
		ansi Occupation = "king of kings"
		ansi Motto = "look on my works, ye Mighty, and despair!"
		
		-- 6.1 first method - create a formatted string - commas and spaces added where necessary - simple and sweet
			ansi First = "[Intro] [Name], [Occupation], [Motto]"
			Window <- print("\nFirst method (formatted string): [First]")
		
		-- 6.2 second method - use the append function for strings
			ansi Second
			Second.append(Intro)		
			Second.append(" ")		
			Second.append(Name)		
			Second.append(", ") 	
			Second.append(Occupation)
			Second.append(", ")
			Second.append(Motto)
			Window <- print("\nSecond method (append method): [Second]")
			
			-- as append returns an ansi string we can also do the following..
			Second = ""
			Second.append(Intro).append(" ").append(Name).append(", ").append(Occupation).append(", ").append(Motto)		
			Window <- print("\nSecond method (append chain): [Second]")
		
		-- 6.3 third method : we use the '+' operator we have defined (see strings.cellsrc)
			ansi Third = ""	
			Third = Intro + " " + Name + ", " + Occupation + ", " + Motto
			Window <- print("\nThird method (append operator): [Third]")
		
		-- 6.4 fourth method - using slices - a bit more cumbersome because of the position bookkeeping	
			var Il = Intro.len()
			var Nl = Name.len()
			var Ol = Occupation.len()
			var Ml = Motto.len()
			
			-- we use n to keep track of where the next part of the string has to come
			var n = 0
						
			-- get a string of sufficient length - the +6 in the size of the string is for  3 spaces, two commas and the terminating 0
			ansi[ Il + Nl + Ol + Ml + 6 ] Fourth

			Fourth = Intro 
			Fourth[n+=Il:] = " "
			Fourth[n+=1:] = Name
			Fourth[n+=Nl:]= ", "
			Fourth[n+=2:] = Occupation
			Fourth[n+=Ol:] = ", "
			Fourth[n+=2:] = Motto
			Window <- print("\nFourth method (array slices): [Fourth]")
			
	-. 7. Unicode .-
	
		Window <- print("\n\n***Test7:  Unicode\n")
	
		-- CellSpeak provides functions to work with encodings. Strings of different encodings can be defined and there are
		-- functions to check the encoding of a string. Whether a string is displayed correctly depends on the
		-- editor, or more generally the software, that has to display it. The internet is almost entirely utf8 and
		-- many editors also let you choose the encoding you want to use.
		
		utf8 u1 = "\nText using Unicode UTF-8 encoding."

		Window <- print(u1)
		
		-. utf8 encoding example:		
		α = U+0x3b1 = 0000'0011 1011'0001 -> 1100'1110 1011'0001 -> ce b1
					        yyy yyxx xxxx	    y yyyy   xx xxxx
		β = U+0x3b2 = 0000'0011 1011'0010 -> 1100'1110 1011'0010 -> ce b2
		γ = U+0x3b3 = 0000'0011 1011'0011 -> 1100'1110 1011'0011 -> ce b3		
		
		Greek alphabet: αβγδεζηθικλμνξοπρςστυφχψω
		.-
		
		-- we append the Unicode code points for the Greek characters
		-- The append function will convert each code point to its UTF-8 encoding
		utf8 Greek = "\nGreek alphabet: "
		for i=0 to 24 do
			Greek.append(0x000003b1 + i)
		end
		Window <- print(Greek)
		
		utf8 Math = "\nMath symbols: "
		for i=0 to 0x2F do
			Math.append(0x00002200 + i)
		end
		Window <- print(Math)
		
		utf8 ChineseProverb = "\nWise words indeed: 杀鸡儆猴"
		Window <- print(ChineseProverb)
		
		int α, β=2, γ=3
		α=β+γ
		Window <- print("\nα = β + γ -> [α] = [β] + [γ]")
		
	-. 8. Substrings .-
	
		Window <- print("\n\n***Test8:  Substring references\n")	
		
		-- By using the reference assignment and slices, strings can be set to substrings
		-- of other strings, i.e. without copying. The Mercury Seven (from Wikipedia):
		utf8 M7 = 
		"
		The Mercury Seven were the group of seven Mercury astronauts announced by NASA on April 9, 1959.
		They are also referred to as the Original Seven or Astronaut Group 1. They piloted the manned 
		spaceflights of the Mercury program from May 1961 to May 1963. These seven original American 
		astronauts were Scott Carpenter, Gordon Cooper, John Glenn, Gus Grissom, Wally Schirra, 
		Alan Shepard, and Deke Slayton.

		Members of the group flew on all classes of NASA manned orbital spacecraft of the 20th century — 	
		Mercury, Gemini, Apollo, and the Space Shuttle. Gus Grissom died in 1967, in the Apollo 1 fire. 
		The others all survived past retirement from service. John Glenn went on to become a U.S. senator, 
		and flew on the Shuttle 36 years later to become the oldest person to fly in space. 
		He was the last living member of the class when he died in 2016"
		
		-- We print the original text 
		Window <- println(M7)
		
		-- a string where we will keep a reference to the names of the astronauts
		utf8 Astronauts		
		
		-- find where the first name starts
		var From = M7.find("Scott") 
		
		-- this creates a sub-array from 'Scott' to the end of the text - nothing has been copied
(9)		Astronauts := M7[ From: ]
		
		-- in this new array find the first '.' - note that the search starts at 'Scott'
		var To = Astronauts.find(0x'.')
		
		-- Now we make Astronauts a shorter sub-array by setting the end index. Note that Astronauts := M7[From:To] would not work, because To is an index into Astronauts, not into M7 !
		Astronauts := Astronauts[:To] 		
				
		-- Because Astronauts is somewhere in the text, it is not 0 terminated, so we change the '.' at position To to 0
		Astronauts[To] = 0x00
		
		Window <- println("\nThe right stuff: [Astronauts]")
	
	end
end
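
The UTF-8 arithmetic worked out in the comment block of Test 7 can be checked with a short Python sketch. The function `utf8_two_bytes` is a hypothetical helper written for this illustration; it only handles the two-byte range that the Greek letters fall into, and it says nothing about how the CellSpeak append function is implemented.

```python
def utf8_two_bytes(codepoint: int) -> bytes:
    """Encode a code point in the range U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= codepoint <= 0x7FF:
        raise ValueError("code point needs a different number of bytes")
    lead = 0b1100_0000 | (codepoint >> 6)          # 110yyyyy: upper five bits
    cont = 0b1000_0000 | (codepoint & 0b11_1111)   # 10xxxxxx: lower six bits
    return bytes([lead, cont])


for cp in (0x3B1, 0x3B2, 0x3B3):
    encoded = utf8_two_bytes(cp)
    # Cross-check against Python's built-in encoder.
    assert encoded == chr(cp).encode("utf-8")
    print(f"U+{cp:04X} -> {encoded.hex(' ')}")

# The same 25 code points that the loop in Test 7 appends.
greek = "".join(chr(0x3B1 + i) for i in range(25))
print(greek)
```

The math symbols appended in the second loop start at U+2200 and need three bytes each, so they fall outside the range this helper accepts.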