as2js: as2js::String Class Reference

#include <string.h>

Public Types
enum conversion_result_t {
  conversion_result_t::STRING_GOOD = 0,
  conversion_result_t::STRING_END = -1,
  conversion_result_t::STRING_BAD = -2,
  conversion_result_t::STRING_INVALID = -3
}

Public Member Functions
  String ()  Initialize an empty string.
  String (char const *str, int len=-1)  Create a string from the specified input string.
  String (wchar_t const *str, int len=-1)  Create a string from the specified input string.
  String (as_char_t const *str, int len=-1)  Create a string from the specified input string.
  String (std::string const &str)  Copy str in this String.
  String (std::wstring const &str)  Copy str in this String.
  String (std::basic_string< as_char_t > const &str)  Copy str in this String.
  conversion_result_t from_as_char (as_char_t const *str, int len=-1)  Copy an as_char_t string to this String.
  conversion_result_t from_char (char const *str, int len=-1)  Copy a string in this String object.
  conversion_result_t from_utf8 (char const *str, int len=-1)  Copy a UTF-8 string to this String.
  conversion_result_t from_utf8 (std::string const &str)  Copy a UTF-8 string to this String.
  conversion_result_t from_wchar (wchar_t const *str, int len=-1)  Copy a wchar_t string to this String.
  bool is_float64 () const  Check whether the string represents a valid floating point number.
  bool is_int64 () const  Check whether this string represents a valid integer.
  bool is_number () const  Check whether this string represents a number.
  bool is_true () const  Check whether the string is considered true.
  bool operator!= (char const *str) const  Compare this String against a char const * string.
  String & operator+= (char const *str)  Append str to this String.
  String & operator+= (wchar_t const *str)  Append str to this String.
  String & operator+= (as_char_t const *str)  Append str to this String.
  String & operator+= (std::string const &str)  Append str to this String.
  String & operator+= (std::wstring const &str)  Append str to this String.
  String & operator+= (std::basic_string< as_char_t > const &str)  Append str to this String.
  String & operator+= (as_char_t const c)  Append c to this String.
  String & operator+= (char const c)  Append c to this String.
  String & operator+= (wchar_t const c)  Append c to this String.
  String & operator= (char const *str)  Copy str in this String.
  String & operator= (wchar_t const *str)  Copy str in this String.
  String & operator= (as_char_t const *str)
  String & operator= (std::string const &str)  Copy str in this String.
  String & operator= (std::wstring const &str)  Copy str in this String.
  String & operator= (std::basic_string< as_char_t > const &str)  Copy str in this String.
  bool operator== (char const *str) const  Compare this String against a char const * string.
  String simplified () const  Make a simplified copy of this string.
  Float64::float64_type to_float64 () const  Convert a string to a floating point number.
  Int64::int64_type to_int64 () const  Convert a string to an integer number.
  std::string to_utf8 () const  Convert a string to UTF-8 and return the result.
  ssize_t utf8_length () const  Calculate the length if converted to UTF-8.
  bool valid () const  Check validity of the string.

Static Public Member Functions
  static bool valid_character (as_char_t c)  Check whether a character is considered valid.

Static Public Attributes
  static as_char_t const STRING_BOM = 0xFEFF
  static as_char_t const STRING_CONTINUATION = -2

Friends
  bool operator!= (char const *str, String const &string)  Compare a String against a char const * string.
  bool operator== (char const *str, String const &string)  Compare a String against a char const * string.

  Detailed Description

Definition at line 51 of file string.h.

Member Enumeration Documentation
Enumerator
STRING_GOOD
STRING_END
STRING_BAD
STRING_INVALID

Definition at line 60 of file string.h.

Constructor & Destructor Documentation
as2js::String::String ( )

This function initializes an empty string.

Definition at line 60 of file string.cpp.

Referenced by from_wchar().

as2js::String::String ( char const *  str, int  len = -1  )

This function creates a string and initializes it with the specified input string.

The input is considered to be ISO-8859-1 and thus it gets copied in the string as is (see the from_char() function.) If you have UTF-8 data, make sure to use the from_utf8() function instead.

Note that we cannot include '\0' characters in our strings. This function stops at the first null terminator no matter what.

Note
The str pointer can be set to nullptr in which case the string is considered empty.
Parameters
[in] str  A string; if it is not null terminated, make sure to define the len parameter.
[in] len  The length of the string; if -1, expect a '\0' terminator.
See also
from_utf8()
from_char()

Definition at line 89 of file string.cpp.

References from_char().

as2js::String::String ( wchar_t const *  str, int  len = -1  )

This function creates a string and initializes it with the specified input string.

The input is considered to be UTF-32 or UTF-16 depending on the width of the wchar_t type.

Note that we cannot include '\0' characters in our strings. This function stops at the first null terminator no matter what.

Note
The str pointer can be set to nullptr in which case the string is considered empty.
Parameters
[in] str  A string; if it is not null terminated, make sure to define the len parameter.
[in] len  The length of the string; if -1, expect a '\0' terminator.
See also
from_wchar()

Definition at line 117 of file string.cpp.

References from_wchar().

as2js::String::String ( as_char_t const *  str, int  len = -1  )

This function creates a string and initializes it with the specified input string.

The input is considered to be UTF-32 and thus it gets copied as is.

Note that we cannot include '\0' characters in our strings. This function stops at the first null terminator no matter what.

Note
The str pointer can be set to nullptr in which case the string is considered empty.
Parameters
[in] str  A string; if it is not null terminated, make sure to define the len parameter.
[in] len  The length of the string; if -1, expect a '\0' terminator.
See also
from_as_char()

Definition at line 144 of file string.cpp.

References from_as_char(), and STRING_GOOD.

as2js::String::String ( std::string const &  str)

This function copies str in this String.

The input is considered to be ISO-8859-1 and thus it gets copied in the string as is (see the from_char() function.) If you have UTF-8 data, make sure to use the from_utf8() function instead.

Parameters
[in] str  The input string to copy in this String.
See also
from_char()

Definition at line 166 of file string.cpp.

References from_char().

as2js::String::String ( std::wstring const &  str)

This function copies str in this String.

The input string is taken as UTF-16 if wchar_t is 2 bytes and as such converts the surrogates (0xD800 to 0xDFFF) to UTF-32 characters as expected. If wchar_t is 4 bytes, the string is copied as is.

Parameters
[in] str  The input string to copy in this String.
See also
from_wchar()

Definition at line 185 of file string.cpp.

References from_wchar().

as2js::String::String ( std::basic_string< as_char_t > const &  str)

This function copies str in this String.

The input string is taken as UTF-32 and copied as is in its entirety.

Parameters
[in] str  The input string to copy in this String.

Definition at line 200 of file string.cpp.

Member Function Documentation
String::conversion_result_t as2js::String::from_as_char ( as_char_t const *  str, int  len = -1  )

This function copies an as_char_t string to this String. Since an as_char_t string has the same character type as a String, this copy is straightforward.

The copy stops as soon as a null ('\0') character is found.

Note
If an error occurs, this String object is not modified.
This function can be called with a nullptr in str, in which case the string is considered empty.
Parameters
[in] str  The input string to copy in this string.
[in] len  The maximum number of characters to copy; if -1, copy up to the next null ('\0') character.
Returns
STRING_INVALID: if the resulting character is not a valid UTF-32 character.
STRING_GOOD: the new string is valid.

Definition at line 733 of file string.cpp.

References STRING_GOOD, STRING_INVALID, and valid_character().

Referenced by String().

String::conversion_result_t as2js::String::from_char ( char const *  str, int  len = -1  )

This function concatenates this String and a standard C-string.

Note
This function creates a copy of the string. If you can, try to use the += operator instead.
Parameters
[in] str  The string to concatenate at the end of this String.
Returns
A new string with the concatenated result.

Concatenate a String and a wide C-string.

This function concatenates this String and a standard wide C-string.

Note
This function creates a copy of the string. If you can, try to use the += operator instead.
Parameters
[in] str  The string to concatenate at the end of this String.
Returns
A new string with the concatenated result.

Concatenate a String and a C-like string made of as_char_t characters.

This function concatenates this String and a C-like string made of as_char_t characters. The array must be null terminated (\0).

Note
This function creates a copy of the string. If you can, try to use the += operator instead.
Parameters
[in] str  The string to concatenate at the end of this String.
Returns
A new string with the concatenated result.

Concatenate a String and a C++ string.

This function concatenates this String and a C++ string.

Note
This function creates a copy of the string. If you can, try to use the += operator instead.
Parameters
[in] str  The string to concatenate at the end of this String.
Returns
A new string with the concatenated result.

Concatenate a String and a C++ wide string.

This function concatenates this String and a C++ wide string.

Note
This function creates a copy of the string. If you can, try to use the += operator instead.
Parameters
[in] str  The string to concatenate at the end of this String.
Returns
A new string with the concatenated result.

Copy a string in this String object.

This function copies the ISO-8859-1 string pointed to by str to this String. The previous string is lost.

If a null character is found, the copy stops.

The len parameter can be used to limit the length of the copy.

Note
This function can be called with a nullptr in str, in which case the string is considered empty.
Parameters
[in] str  The input string to copy in this string.
[in] len  The maximum number of characters to copy; if -1, copy up to the next null ('\0') character.
Returns
Always STRING_GOOD, since every ISO-8859-1 byte is a valid Unicode character.

Definition at line 588 of file string.cpp.

References STRING_GOOD.

Referenced by operator=(), and String().
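
The ISO-8859-1 copy described above can be illustrated with a small standalone sketch. It does not use the as2js library: latin1_to_utf32() is a hypothetical helper, and std::u32string stands in for the String buffer. Each input byte becomes the code point of the same value, and the copy stops at the first null character or after len bytes.

```cpp
#include <string>

// Hypothetical helper, not part of as2js: widen ISO-8859-1 bytes to UTF-32.
std::u32string latin1_to_utf32(char const * str, int len = -1)
{
    std::u32string out;
    if(str == nullptr)
    {
        // a null pointer is treated as an empty string
        return out;
    }
    for(int i = 0; (len < 0 || i < len) && str[i] != '\0'; ++i)
    {
        // go through unsigned char so bytes 0x80..0xFF are not sign-extended
        out += static_cast<char32_t>(static_cast<unsigned char>(str[i]));
    }
    return out;
}
```

Under these assumptions every byte maps to a valid code point, which is consistent with the function always reporting STRING_GOOD.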

String::conversion_result_t as2js::String::from_utf8 ( char const *  str, int  len = -1  )

This function copies a string to this String. In this case the input string is considered to be UTF-8.

If you have an std::string, use the c_str() operation to call this function.

The copy stops as soon as a null ('\0') character is found.

Note
If an error occurs, this String object is not modified.
Parameters
[in] str  The input string to copy in this string.
[in] len  The maximum number of characters to copy; if -1, copy up to the next null ('\0') character.
Returns
STRING_INVALID: if the resulting character is not a valid UTF-32 character.
STRING_BAD: if the input is invalid.
STRING_END: could not be converted (not enough data for last character).
STRING_GOOD: the new string is valid.

Definition at line 792 of file string.cpp.

References STRING_BAD, STRING_END, STRING_GOOD, STRING_INVALID, and valid_character().

Referenced by from_utf8(), as2js::DecodingFilterDetect::get_char(), as2js::rc_t::get_home(), as2js::Compiler::get_package_filename(), and as2js::rc_t::init_rc().
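
The four result codes can be illustrated with a standalone UTF-8 decoder. This is only a sketch under stated assumptions, not the as2js implementation: utf8_result_t and utf8_to_utf32() are hypothetical names, the input is a std::string rather than a pointer and length, and overlong encodings are not rejected.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result codes mirroring the values documented for conversion_result_t.
enum class utf8_result_t { GOOD = 0, END = -1, BAD = -2, INVALID = -3 };

// Hypothetical helper: decode UTF-8 to UTF-32; 'out' is only modified on success.
utf8_result_t utf8_to_utf32(std::string const & in, std::u32string & out)
{
    std::u32string result;
    std::size_t i = 0;
    while(i < in.length())
    {
        unsigned char const lead = static_cast<unsigned char>(in[i++]);
        char32_t c = lead;
        int extra = 0;  // number of continuation bytes expected
        if(lead < 0x80)                { extra = 0; }
        else if((lead & 0xE0) == 0xC0) { c = lead & 0x1F; extra = 1; }
        else if((lead & 0xF0) == 0xE0) { c = lead & 0x0F; extra = 2; }
        else if((lead & 0xF8) == 0xF0) { c = lead & 0x07; extra = 3; }
        else                           { return utf8_result_t::BAD; }
        for(; extra > 0; --extra)
        {
            if(i >= in.length())
            {
                // the last character is truncated
                return utf8_result_t::END;
            }
            unsigned char const b = static_cast<unsigned char>(in[i++]);
            if((b & 0xC0) != 0x80)
            {
                // not a continuation byte
                return utf8_result_t::BAD;
            }
            c = (c << 6) | (b & 0x3F);
        }
        if(c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        {
            // well-formed bytes, but not a usable code point
            return utf8_result_t::INVALID;
        }
        result += c;
    }
    out = result;
    return utf8_result_t::GOOD;
}
```

Building the result in a temporary and assigning it only at the end is one way to obtain the "not modified on error" behavior described in the note above.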

String::conversion_result_t as2js::String::from_utf8 ( std::string const &  str)

See the from_utf8(char const * str, int len = -1) implementation for additional information.

Parameters
[in] str  The input string to copy in this string.
Returns
The conversion status.

Definition at line 891 of file string.cpp.

References from_utf8().

String::conversion_result_t as2js::String::from_wchar ( wchar_t const *  str, int  len = -1  )

This function copies a wchar_t string to this String. Internally we only deal with UTF-32 characters. However, this function expects the input to possibly be UTF-16 and converts surrogate characters to UTF-32 as expected in UTF-16. (In other words, this function works under Linux and MS-Windows.)

Note
This string is not modified if the input is not valid.
This function can be called with a nullptr in str, in which case the string is considered empty.
Parameters
[in] str  The input string to copy in this string.
[in] len  The maximum number of characters to copy; if -1, copy up to the next null ('\0') character.
Returns
STRING_INVALID: if a character is not a valid UTF-32 character.
STRING_BAD: if the input is invalid.
STRING_END: could not be converted (not enough data for last surrogate character).
STRING_GOOD: the new string is valid.

Definition at line 638 of file string.cpp.

References String(), STRING_BAD, STRING_END, STRING_GOOD, and STRING_INVALID.

Referenced by as2js::Message::operator<<(), operator=(), and String().
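
The surrogate handling can be sketched independently of the library. The example below is an assumption-laden illustration, not the as2js code: utf16_result_t and utf16_to_utf32() are hypothetical names, and std::u16string is used so that the input is UTF-16 regardless of the width of wchar_t.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result codes mirroring STRING_GOOD, STRING_END and STRING_BAD.
enum class utf16_result_t { GOOD = 0, END = -1, BAD = -2 };

// Hypothetical helper: combine surrogate pairs; 'out' is only modified on success.
utf16_result_t utf16_to_utf32(std::u16string const & in, std::u32string & out)
{
    std::u32string result;
    for(std::size_t i = 0; i < in.length(); ++i)
    {
        char32_t c = in[i];
        if(c >= 0xD800 && c <= 0xDBFF)
        {
            // high surrogate: a low surrogate must follow
            if(i + 1 >= in.length())
            {
                return utf16_result_t::END;
            }
            char32_t const low = in[++i];
            if(low < 0xDC00 || low > 0xDFFF)
            {
                return utf16_result_t::BAD;
            }
            c = 0x10000 + ((c - 0xD800) << 10) + (low - 0xDC00);
        }
        else if(c >= 0xDC00 && c <= 0xDFFF)
        {
            // low surrogate without a preceding high surrogate
            return utf16_result_t::BAD;
        }
        result += c;
    }
    out = result;
    return utf16_result_t::GOOD;
}
```

When wchar_t is 4 bytes, no pairing is needed and only the per-character validity check applies.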

bool as2js::String::is_float64 ( ) const

This function parses the string to see whether it represents a valid floating point number: an integral part, an optional decimal part, and an optional signed exponent.

The sign of the exponent is also itself optional.

Note that this function returns true if the number is an integer in decimal representation; however, it returns false for hexadecimal numbers. You may also call the is_number() function to know whether a string represents either an integer or a floating point number.

  • A floating point number: [-+]?[0-9]+(.[0-9]+)?([eE]?[0-9]+)?
Todo:
Ameliorate the test so if no digits are present where required then an error is emitted (i.e. you may have '0.', '.0' but not just '.'; same problem with exponent).
Returns
true if the string represents a floating point number.

Definition at line 1105 of file string.cpp.

Referenced by is_number(), to_float64(), and as2js::Node::to_int64().
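
As a rough illustration of the syntax described in prose above (integral part, optional decimal part, optional exponent with an optional sign), the check can be approximated with std::regex. This is a sketch only: looks_like_float64() is a hypothetical name, the pattern follows the prose rather than the library's parser, and it may accept or reject edge cases (such as '0.' or '.0') differently from is_float64().

```cpp
#include <regex>
#include <string>

// Hypothetical helper: approximate the floating point syntax with a regular expression.
bool looks_like_float64(std::string const & s)
{
    // sign? digits (. digits)? ([eE] sign? digits)?
    static std::regex const pattern(R"re([-+]?[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?)re");
    return std::regex_match(s, pattern);
}
```

Because std::regex_match() must consume the whole input, trailing text such as "1.5abc" is rejected, as is a hexadecimal literal such as "0x10".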

bool as2js::String::is_int64 ( ) const

This function checks the string to see whether it represents a valid integer. The function supports decimal and hexadecimal numbers. Octals are not supported because JavaScript does not convert numbers that start with a 0 as if these were octal numbers.

  • Decimal number: [-+]?[0-9]+
  • Hexadecimal number: [-+]?0[xX][0-9a-fA-F]+
Returns
true if the string represents an integer.

Definition at line 1037 of file string.cpp.

Referenced by is_number(), to_int64(), and as2js::Node::to_int64().
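
The two patterns listed above can be tried directly with std::regex. The helper below is a minimal sketch with a hypothetical name (looks_like_int64()); it only checks the syntax and, unlike a real parser, does not verify that the value fits in 64 bits.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: match the decimal and hexadecimal patterns from the text.
bool looks_like_int64(std::string const & s)
{
    static std::regex const decimal("[-+]?[0-9]+");
    static std::regex const hexadecimal("[-+]?0[xX][0-9a-fA-F]+");
    return std::regex_match(s, decimal) || std::regex_match(s, hexadecimal);
}
```

Note that "017" matches the decimal pattern, so a leading zero does not switch to octal.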

bool as2js::String::is_number ( ) const

This function checks whether this string represents a number. This means it returns true in the following cases:

  • The string represents a decimal number ([-+]?[0-9]+)
  • The string represents a hexadecimal number ([-+]?0[xX][0-9a-fA-F]+)
  • The string represents a floating point number ([-+]?[0-9]+(.[0-9]+)?([eE]?[0-9]+)?)

Unfortunately, JavaScript does not understand the strings "true", "false", and "null" as numbers (even though isNaN(true), isNaN(false), and isNaN(null) all return false.)

Returns
true if this string represents a valid number.

Definition at line 1155 of file string.cpp.

References is_float64(), and is_int64().

bool as2js::String::is_true ( ) const

A string that is empty is considered false. Any other string is considered true.

Returns
true if the string is not empty.

Definition at line 1253 of file string.cpp.

Referenced by as2js::Node::to_boolean(), and as2js::Node::to_boolean_type_only().

bool as2js::String::operator!= ( char const *  str) const

This function compares an ISO-8859-1 string against this String. If you have a UTF-8 string, make sure to use from_utf8() first and then compare the two String objects against each other.

Parameters
[in] str  The string to compare as ISO-8859-1.
Returns
true if both strings are not equal.

Definition at line 942 of file string.cpp.

String & as2js::String::operator+= ( char const *  str)

This function appends str to this String. The string is viewed as ISO-8859-1. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to append to this String.
Returns
A reference to this string.

Definition at line 309 of file string.cpp.

String & as2js::String::operator+= ( wchar_t const *  str)

This function appends str to this String. The string is viewed as UTF-16 if wchar_t is 2 bytes, and UTF-32 if wchar_t is 4 bytes. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to append to this String.
Returns
A reference to this string.
See also
from_wchar()

Definition at line 329 of file string.cpp.

String & as2js::String::operator+= ( as_char_t const *  str)

This function appends str to this String. The string is viewed as UTF-32. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to append to this String.
Returns
A reference to this string.

Definition at line 347 of file string.cpp.

String & as2js::String::operator+= ( std::string const &  str)

This function appends str to this String. The string is viewed as ISO-8859-1. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to append to this String.
Returns
A reference to this string.

Definition at line 364 of file string.cpp.

String & as2js::String::operator+= ( std::wstring const &  str)

This function appends str to this String. The string is viewed as UTF-16 if wchar_t is 2 bytes, and UTF-32 if wchar_t is 4 bytes. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to append to this String.
Returns
A reference to this string.

Definition at line 382 of file string.cpp.

String & as2js::String::operator+= ( std::basic_string< as_char_t > const &  str)

This function appends str to this String. The string is viewed as UTF-32. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to append to this String.
Returns
A reference to this string.

Definition at line 400 of file string.cpp.

String & as2js::String::operator+= ( as_char_t const  c)

This function appends c to this String. The character is viewed as UTF-32. If another format is expected, make sure to use the proper function.

Parameters
[in] c  The character to append to this String.
Returns
A reference to this string.

Definition at line 417 of file string.cpp.

String & as2js::String::operator+= ( char const  c)

This function appends c to this String. The character is viewed as ISO-8859-1. If another format is expected, make sure to use the proper function.

Parameters
[in] c  The character to append to this String.
Returns
A reference to this string.

Definition at line 434 of file string.cpp.

String & as2js::String::operator+= ( wchar_t const  c)

This function appends c to this String. The character is viewed as UTF-32. If another format is expected, make sure to use the proper function.

Todo:
Under MS-Windows the character is viewed as UTF-16, only we do not properly manage surrogates in this case (i.e. if you just added another surrogate, concatenate both surrogates in one UTF-32 character.)
Parameters
[in] c  The character to append to this String.
Returns
A reference to this string.

Definition at line 457 of file string.cpp.

References valid_character().

String & as2js::String::operator= ( char const *  str)

This function copies str in this String. The string is viewed as ISO-8859-1. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to copy in this String.
Returns
A reference to this string.
See also
from_char()

Definition at line 218 of file string.cpp.

References from_char().

String & as2js::String::operator= ( wchar_t const *  str)

This function copies str in this String. The string is viewed as UTF-16 if wchar_t is 2 bytes, and UTF-32 if wchar_t is 4 bytes. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to copy in this String.
Returns
A reference to this string.
See also
from_wchar()

Definition at line 237 of file string.cpp.

References from_wchar().

String & as2js::String::operator= ( as_char_t const *  str)

String & as2js::String::operator= ( std::string const &  str)

This function copies str in this String. The string is viewed as ISO-8859-1. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to copy in this String.
Returns
A reference to this string.
See also
from_char()

Definition at line 256 of file string.cpp.

References from_char().

String & as2js::String::operator= ( std::wstring const &  str)

This function copies str in this String. The string is viewed as UTF-16 if wchar_t is 2 bytes, and UTF-32 if wchar_t is 4 bytes. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to copy in this String.
Returns
A reference to this string.
See also
from_wchar()

Definition at line 275 of file string.cpp.

References from_wchar().

String & as2js::String::operator= ( std::basic_string< as_char_t > const &  str)

This function copies str in this String. The string is viewed as UTF-32. If another format is expected, make sure to use the proper function.

Parameters
[in] str  The string to copy in this String.
Returns
A reference to this string.

Definition at line 292 of file string.cpp.

bool as2js::String::operator== ( char const *  str) const

This function compares an ISO-8859-1 string against this String. If you have a UTF-8 string, make sure to use from_utf8() first and then compare the two String objects against each other.

Parameters
[in] str  The string to compare as ISO-8859-1.
Returns
true if both strings are equal.

Definition at line 907 of file string.cpp.

String as2js::String::simplified ( ) const

This function makes a copy of this string, removing spaces from the start and the end, and keeping a single space wherever spaces appear within the string.

If the string starts with a number, then only the number is kept.

Note
This function is primarily used to compare a string using the smart match operator.
Returns
The simplified string.

Definition at line 1400 of file string.cpp.
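
The whitespace part of this simplification can be sketched as follows. This is only an illustration under assumptions: collapse_spaces() is a hypothetical name, the set of characters treated as spaces (space, tab, newline, carriage return) is a guess, and the rule that keeps only a leading number is not reproduced.

```cpp
#include <string>

// Hypothetical helper: trim both ends and collapse inner runs of spaces to one space.
std::u32string collapse_spaces(std::u32string const & in)
{
    std::u32string out;
    bool pending_space = false;
    for(char32_t const c : in)
    {
        bool const is_space = c == U' ' || c == U'\t' || c == U'\n' || c == U'\r';
        if(is_space)
        {
            // remember the gap, but only once some text was already emitted
            pending_space = !out.empty();
        }
        else
        {
            if(pending_space)
            {
                out += U' ';
                pending_space = false;
            }
            out += c;
        }
    }
    // a pending space at the end is never emitted, which trims the tail
    return out;
}
```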

Float64::float64_type as2js::String::to_float64 ( ) const

This function verifies that the string represents a valid floating point number; if so, it converts the string and returns the result.

If the string does not represent a valid floating point, then the function returns NaN.

Warning
On an empty string, this function returns 0.0 and not NaN as expected in JavaScript.
Note
When used by the lexer, it should always work since the lexer reads floating points with the same expected syntax.
Returns
The string as a floating point.

Definition at line 1230 of file string.cpp.

References is_float64(), and to_utf8().

Referenced by as2js::Lexer::read_number(), as2js::Node::to_float64(), as2js::Node::to_int64(), and as2js::Node::to_number().

Int64::int64_type as2js::String::to_int64 ( ) const

This function verifies that the string represents a valid integer number; if so, it converts the string and returns the result.

If the string does not represent a valid integer, then the function should return NaN. Unfortunately, there is no NaN integer. Instead, it returns zero (0) or raises an exception.

Note
When used by the lexer, it should always work since the lexer reads integers with the same expected syntax.
Exceptions
exception_internal_error  The string is not empty and it does not represent what is considered a valid JavaScript integer.
Returns
The string converted to an integer.

Definition at line 1181 of file string.cpp.

References is_int64(), and to_utf8().

Referenced by as2js::Node::to_int64().
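
The documented behavior (zero for an empty string, an exception for anything that is not a valid integer) can be sketched without the library. In this illustration parse_int64() is a hypothetical name and std::invalid_argument stands in for exception_internal_error; out-of-range values are simply clamped by std::strtoll(), which may differ from as2js.

```cpp
#include <cstdint>
#include <cstdlib>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical helper: validate first, then convert with an explicit base.
std::int64_t parse_int64(std::string const & s)
{
    if(s.empty())
    {
        // an empty string converts to zero instead of raising an error
        return 0;
    }
    static std::regex const decimal("[-+]?[0-9]+");
    static std::regex const hexadecimal("[-+]?0[xX][0-9a-fA-F]+");
    if(std::regex_match(s, decimal))
    {
        // base 10 is forced so a leading 0 is never read as octal
        return std::strtoll(s.c_str(), nullptr, 10);
    }
    if(std::regex_match(s, hexadecimal))
    {
        // with base 16, strtoll() accepts the optional sign and 0x prefix
        return std::strtoll(s.c_str(), nullptr, 16);
    }
    throw std::invalid_argument("not a valid integer: " + s);
}
```

Passing an explicit base instead of 0 is the design choice that matters here: with base 0, std::strtoll() would interpret "017" as octal 15 rather than decimal 17.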

std::string as2js::String::to_utf8 ( ) const

This function converts this String to UTF-8 using an std::string and then returns the result.

Warning
Remember that you cannot use a UTF-8 string as direct input of a constructor or assignment operator of the String class. Instead, make sure to use the from_utf8() function.
Note
The function skips any character considered invalid. If you want to know whether the resulting UTF-8 string is an exact representation of this String, then first call the valid() function on the source.
Todo:
This String object is expected to not have any invalid characters so this function always returns the conversion even if it finds invalid characters.
Returns
The String converted to UTF-8 and saved in an std::string.

Definition at line 1343 of file string.cpp.

References valid_character().

Referenced by as2js::StandardOutput::internal_write(), as2js::FileOutput::internal_write(), as2js::Compiler::load_internal_packages(), as2js::FileInput::open(), as2js::FileOutput::open(), as2js::Message::operator<<(), as2js::operator<<(), as2js::Lexer::read_number(), to_float64(), and to_int64().
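
The conversion, including the skipping of invalid characters mentioned in the note above, can be illustrated with a standalone encoder. This is a sketch rather than the library code: utf32_to_utf8() is a hypothetical name and std::u32string stands in for the String buffer.

```cpp
#include <string>

// Hypothetical helper: encode UTF-32 as UTF-8, silently skipping invalid code points.
std::string utf32_to_utf8(std::u32string const & in)
{
    std::string out;
    for(char32_t const c : in)
    {
        if(c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        {
            // out of range or a UTF-16 surrogate: skip it
            continue;
        }
        if(c < 0x80)
        {
            out += static_cast<char>(c);
        }
        else if(c < 0x800)
        {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
        else if(c < 0x10000)
        {
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (c >> 18));
            out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Because invalid characters are dropped without any error, the output can be shorter than expected, which is why checking valid() on the source first is suggested.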

ssize_t as2js::String::utf8_length ( ) const

This function calculates the length necessary to convert the string to UTF-8.

Returns
The length if converted to UTF-8.

Definition at line 1283 of file string.cpp.

References valid_character().

bool as2js::String::valid ( ) const

This function checks all the characters for validity. This is based on a Unicode piece of code that clearly specifies that a certain number of characters just cannot be used (i.e. this includes UTF-16 surrogates, and any value larger than 0x10FFFF or negative numbers.)

Note that the null character '\0' is considered valid and part of the string, however, anything after that character is ignored.

Todo:
We are actually transforming the String object to properly check all of its characters as added to the buffer so this function should become obsolete at some point.
Returns
true if the string is considered valid.
See also
valid_character()

Definition at line 986 of file string.cpp.

References valid_character().

bool as2js::String::valid_character ( as_char_t  c) static

The UTF-32 type is limited in the code points that can be used. This function returns true if the code point of c is considered valid.

Characters in UTF-32 must be defined between 0 and 0x10FFFF inclusive, except for code points 0xD800 to 0xDFFF, which are used as surrogates for UTF-16 encoding.

Parameters
[in] c  The character to be checked.
Returns
true if c is considered valid.
See also
valid()

Definition at line 1015 of file string.cpp.

Referenced by from_as_char(), from_utf8(), as2js::DecodingFilterUTF8::get_char(), as2js::DecodingFilterUTF32LE::get_char(), as2js::DecodingFilterUTF32BE::get_char(), operator+=(), to_utf8(), utf8_length(), and valid().
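
The rule stated above fits in a few lines. The sketch below is not the library code: is_valid_code_point() and code_point_t are hypothetical names, and as_char_t is assumed to be a signed 32-bit integer (which would be consistent with STRING_CONTINUATION being -2).

```cpp
#include <cstdint>

// Hypothetical stand-in for as_char_t; assumed to be a signed 32-bit integer.
using code_point_t = std::int32_t;

// Returns true for code points in [0, 0x10FFFF] that are not UTF-16 surrogates.
bool is_valid_code_point(code_point_t c)
{
    if(c < 0 || c > 0x10FFFF)
    {
        return false;
    }
    return c < 0xD800 || c > 0xDFFF;
}
```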

Friends And Related Function Documentation
bool operator!= ( char const *  str, String const &  string  ) friend

This function compares an ISO-8859-1 string against a String. If you have a UTF-8 string, make sure to use from_utf8() first and then compare the two String objects against each other.

Parameters
[in] str  The string to compare as ISO-8859-1.
[in] string  The String to compare with.
Returns
true if both strings are not equal.

Definition at line 960 of file string.cpp.

bool operator== ( char const *  str, String const &  string  ) friend

This function compares an ISO-8859-1 string against a String. If you have a UTF-8 string, make sure to use from_utf8() first and then compare the two String objects against each other.

Parameters
[in] str  The string to compare as ISO-8859-1.
[in] string  The String to compare with.
Returns
true if both strings are equal.

Definition at line 925 of file string.cpp.

Member Data Documentation
as_char_t const as2js::String::STRING_BOM = 0xFEFF static

Definition at line 56 of file string.h.

Referenced by as2js::JSON::output().

as_char_t const as2js::String::STRING_CONTINUATION = -2 static

The documentation for this class was generated from the following files:
  • /home/snapwebsites/snapcpp/contrib/as2js/include/as2js/string.h
  • /home/snapwebsites/snapcpp/contrib/as2js/lib/string.cpp

This document is part of the Snap! Websites Project.

Copyright by Made to Order Software Corp.
