Character set

Published by: Nuru

Published date: 21 Jun 2021

Character set

The set of characters used to form words, numbers, and expressions in C is called the C character set. Characters, and the tokens built from them, are the basic elements of every C program.

The C character set is grouped into the following four categories:
1) Letters or alphabets
2) Digits
3) Special Characters
4) White Spaces

Letters

  1. Uppercase: A……Z
  2. Lowercase: a……z

Digits

All decimal digits: 0……9

Special Characters

 

1) , comma 2) . period 3) ; semicolon 4) : colon
5) ? question mark 6) ' apostrophe 7) " quotation mark 8) ! exclamation mark
9) | vertical bar 10) / slash 11) \ backslash 12) ~ tilde
13) _ underscore 14) $ dollar sign 15) % percent sign 16) & ampersand
17) ^ caret 18) * asterisk 19) - minus sign 20) + plus sign
21) < opening angle bracket (or less-than sign) 22) > closing angle bracket (or greater-than sign) 23) ( left parenthesis 24) ) right parenthesis
25) [ left bracket 26) ] right bracket 27) { left brace 28) } right brace
29) # number sign 30) = equals sign

Note: the dollar sign is not part of the basic character set defined by the C standard, although many compilers accept it as an extension.

White Spaces

1. Blank space
2. Horizontal tab
3. Carriage return
4. Newline
5. Form feed

C tokens

1. In a passage of text, individual words and punctuation marks are called tokens. Similarly, in a C program, the smallest individual units are known as tokens of C.

2. The basic elements recognized by the C compiler are known as “C tokens”.

3. Examples of C tokens are: keywords (e.g. float, while), identifiers (e.g. num, sum), constants (e.g. 15.5, 100), string literals (e.g. "ABC", "year"), operators (e.g. +, -, *, /) and special symbols (e.g. [], {}, (), ,).

Every C word is classified as either a keyword or an identifier.

Keywords are predefined words in the C programming language.
• All keywords have fixed meaning and these meanings cannot be changed.
• Keywords serve as basic building blocks for program statements.
• Keywords are also called reserved words because they are used for pre-defined purposes and cannot be used as identifiers.

The original C standard (C89/ANSI C) defines 32 keywords. Later standards add more, such as inline and restrict in C99.

1) auto 2) break 3) case 4) char
5) const 6) continue 7) default 8) do
9) double 10) else 11) enum 12) extern
13) float 14) for 15) goto 16) if
17) int 18) long 19) register 20) return
21) short 22) signed 23) sizeof 24) static
25) struct 26) switch 27) typedef 28) union
29) unsigned 30) void 31) volatile 32) while

Note: All of these keywords are written in lowercase.

Identifiers

• Every word used in a C program to name a variable, function, array, pointer, or symbolic constant is called an identifier.
• Identifiers are user-defined names and consist of a sequence of letters, digits, and underscores, with a letter or underscore as the first character.
• Both uppercase and lowercase letters can be used, although lowercase letters are commonly preferred.
• The underscore character can be used to join the words of a long identifier.

Rules for identifiers

1) The first character must be a letter or an underscore.
2) It must consist only of letters, digits, and underscores.
3) It must not contain white space. Only the underscore is permitted as a separator.
4) Keywords cannot be used.
5) Only the first 31 characters are guaranteed to be significant in C89 (for internal identifiers); C99 raises this minimum to 63.
6) Identifiers are case sensitive, i.e. uppercase and lowercase letters are not interchangeable.