I found an article explaining the problem with null values (as mentioned in Alexi's current post). Here is the link (just search for ShipDateNull to go directly to the part explaining how to use the "indicator" field as it's called): http://www.itjungle.com/mpo/mpo082803-story02.html
Announcement
Collapse
No announcement yet.
Null Values and SQL Fetch
Collapse
X
-
Re: Null Values and SQL Fetch
The Article:
The NULL Nemesis
by Michael Sansoterra
After writing "The Valuable NULL," which enumerated the benefits and
snares of using NULLs in an SQL environment, I received an interesting
response from a reader. He stated that NULLs were more trouble than
they're worth because they usually require extra logic to process correctly.
Though it's true that SQL programmers have more to think about when
dealing with NULL-capable fields, the benefits of NULLs are worth
the extra effort, provided one understands how NULLs behave in
expression and conditional operations. (Before I understood these
features, NULLs were a great source of grief!) But the reader's
statement about NULLs requiring special logic and being an all
around pain is certainly true when using NULL-capable fields
with high-level languages like C, RPG, or COBOL.
The reason for this difficulty in dealing with NULLs is that
high-level languages do not recognize NULL values
(most older HLLs were developed before databases allowed for NULLs).
Here's an illustration of the problem. Say a database table contains a
field called CONTACTNAM, defined as CHAR(30), which allows NULL values.
A HLL program reads this database table and encounters a record where the
field CONTACTNAM is NULL. Since variables in an HLL program don't know what
a NULL is, what value is placed in the variable? In this case, the answer
is a blank. In general, when a database field contains a NULL, the HLL
program will contain the field's "low value." For example, NULL character
fields will contain a blank, NULL numerics will contain a zero, and NULL
dates will contain the lowest possible date value for the specified date
format (for example, 0001-01-01 for a date defined with the *ISO date format).
Technically, these low-value assignments are incorrect, because, as described
my last article, a NULL represents a missing or unknown value. If a numeric
field is NULL, which is different from a zero, how can the HLL program know
the difference? To allow HLLs to recognize NULL values, a special indicator
is used to designate that the variable is NULL. These special "indicators"
take the form of variables or specialized built-in functions. It is the
responsibility of the HLL programmer to go to the extra effort of checking
these NULL indicators to distinguish if the value in the variable is actually
the default low value or a NULL. This article focuses on using these
indicators in a high-level language to correctly work with NULLs.
Defining NULL-Capable Fields
Here's 'a quick review of how fields in database files are defined to hold
a NULL. Shown below are the equivalent SQL and DDS statements needed to
create a simple "orders" file. The only field allowed to contain a NULL in
this file is ShipDate:
Code:SQL CREATE TABLE Orders ( OrderID INTEGER NOT NULL, CustomerID CHAR(5) NOT NULL, OrderDate DATE NOT NULL, ShipDate DATE, Unique (OrderID)) DDS A UNIQUE A R ORDERREC A ORDERID 9B 0 A CUSTOMERID 5A A ORDERDATE L DATFMT(*ISO) A SHIPDATE L DATFMT(*ISO) A ALWNULL A K ORDERID
In SQL, fields are assumed to be NULL-capable
(meaning they are allowed to contain a NULL) unless the NOT
NULL keywords are specified. However, fields defined in DDS
are assumed not to be NULL-capable unless the ALWNULL keyword
is specified.
When defining files, only allow fields to contain NULLs where
it makes sense to have NULLs. Don't unnecessarily create
NULL-capable fields, because each time a field is NULL-capable
extra logic may be required. In general, if the program has the
ability to determine the value for a particular field, then the
field should not be NULL-capable. For example, a field such as
order date should not be NULL-capable, because the program will
always know the date the order was taken. However, a ship date
field can be "NULL-capable", because the program may not know
the actual ship date at the time the order is taken and the
program shouldn't shove an arbitrary date value in the field
to represent this "unknown."
Now that you know how NULLs are defined, take a look at
how they're accessed, with languages such as RPG.
RPG/400 and NULLs
RPG/400 doesn't support processing NULLs in a database file. If a
file contains NULLs, specifying the ALWNULL(*YES) compiler option
on the Create RPG Program (CRTRPGPGM) command allows the program
to access the file as input only, with the caveat that all
NULL-capable fields contain the "default" value when a NULL
is encountered. This means the RPG/400 program will have no
way of distinguishing a NULL from a blank, for example.
Only use the ALWNULL(*YES) compiler option if the program
will process a field's default value the same as it would
if it were a NULL. To be able to work with "nullable" database
fields in an RPG/400 program, you must use embedded SQL with
indicator variables (more on this later).
RPG IV and NULLs
Fortunately, RPG IV has NULL support for database fields that are
NULL-capable. The %NULLIND built-in function is used to test if a field
is NULL, or to set a field to NULL or not NULL. The %NULLIND BIF only
works with NULL-capable database fields; you can't define a stand-alone
field on the D-spec and use %NULLIND to set it to NULL. Program-described
files can not contain NULL-capable fields. As with RPG/400, to process
files with NULL-capable fields, the ALWNULL keyword must be specified
at compile time or as an H-spec keyword. RPG IV's version of ALWNULL
accepts one of the following values:
NO--Processing files containing NULL-capable fields is not allowed (default).
INPUTONLY--Files containing NULL-capable fields can be used in a read-only
capacity.
USRCTL--Files with NULL-capable fields can participate in read and
modification operations (like write, update, or delete), as well as in
keyed operations.
To illustrate, assume we'll be working with the "orders" table defined
above, and it contains a NULL-capable field called ShipDate.
We need to write an order-inquiry program that does something
special if ShipDate contains a NULL
(that is, if the order hasn't been shipped).
Consider the following code excerpt that demonstrates how %NullInd
is used to test whether the field ShipDate is NULL:
Code:H ALWNULL(*UsrCtl) FOrders IF E K Disk . . . C/Free ExFmt GetOrderScreen; Chain OrderID Orders; If %Found; If %NullInd(ShipDate)=*On; // Order hasn't been shipped EndIf; Else; // Order Not Found EndIf; ...
As seen in the above example, %NULLIND returns an indicator (Boolean)
value of *ON or *OFF to indicate the presence of a NULL in a
field. Additionally, when used on the left side of an expression,
%NULLIND can be used to manipulate whether a field contains a NULL:
Code:// Set ShipDate to NULL Eval %NullInd(ShipDate)=*On; // Set ShipDate to a value Eval %NullInd(ShipDate)=*Off; Eval ShipDate=d'2003-08-01';
For clarification, don't attempt to use RPG's *NULL special value
to set database fields to NULL, because *NULL is reserved for
working with pointers. Also note that placing a value in a field
with %NULLIND set to *ON is pointless:
Code:// Set ShipDate to NULL Eval %NullInd(ShipDate)=*On; // If this file were // opened for update, // this statement has // no effect. Eval ShipDate=d'2003-08-01'; Update OrdersRec;
In the last example, the second EVAL statement (ShipDate=d'2003-08-01')
is useless, as far as the database manager is concerned. If the
file is updated when %NULLIND is on, any date value placed in
ShipDate is ignored. However, as far as normal RPG processing is
concerned, if ShipDate is moved to another field, the date value has
meaning, because %NULLIND is only applicable, in the final outcome,
to the database manager. Because of this behavior, it is important to
implement special logic to check the %NULLIND before assigning or reading
values from NULL-capable fields, because RPG itself doesn't pay any attention
to %NullInd when doing its normal MOVE, MOVEL, and EVAL operations.
The ILE RPG Reference guide has a section entitled
"Database Null Value Support," which details a plethora of implications
related to this behavior.
Keyed Operations
Keyed operations (like SETLL, READE, and CHAIN) involving one or more
NULL-capable key fields are simple, as long as your key list is defined
correctly. In this next example, the orders file has an index called
OrdersShip built over it by ShipDate and CustomerID:
Code:CREATE INDEX OrdersShip ON Orders (ShipDate, CustomerID)
shipped (that is, the ShipDate contains a NULL).
This next section of code demonstrates how to search for NULL values in a key field:
Code:H ALWNULL(*USRCTL) FOrdersShipIF E K DISK DCount S 7 0 DShipDateNull S 1N C Key_ShipDate KList C KFld ShipDateNull ShipDate C/Free // Read all records where // the shipdate is NULL // (When the NULL indicator is on // the actual content of the ShipDate // field is ignored.) ShipDateNull=*On; SetLL Key_ShipDate OrdersShip; ReadE Key_ShipDate OrdersShip; Dow Not %EOF; Count=Count+1; // Process Records with a shipdate of NULL ReadE Key_ShipDate OrdersShip; EndDo; *InLR=*On; Return; /End-Free
Notice that Factor 2, which is normally left blank for the KFLD op code,
has an indicator type variable called ShipDateNull specified. This indicator
variable controls whether the RPG program should look for NULL values.
In this case, since we're searching for records with NULL ship dates,
the ShipDateNull indicator variable is turned *ON. To position to a non-NULL
value, turn the indicator *Off and set the ShipDate field appropriately.
If Factor 2 is omitted from the KFLD definition on a NULL-capable field,
the program will have no way to look specifically for a NULL value.
OS/400 V5R2 users will be disappointed to learn that the new %KDS built-in
function doesn't support NULL positions using an indicator, so for now the
traditional KFLD op code must still be used.
When doing keyed positions and reads, remember that, on the iSeries,
NULLs are at the bottom of the sort order. If the NULL indicator in
Factor 2 is *On, the database manager will ignore any values specified
in the ShipDate field and skip right to the NULLs in the file,
if any are present. If there are no NULLs to read, the End Of File
(EOF) indicator will be turned on, because NULLs come last.
High-Level Languages, Embedded SQL, and NULLs
iSeries shops that have the "DB2 Query Manager and SQL Development Kit"
installed (product 5722ST1) can use embedded SQL to access NULL-capable
fields from inside an HLL program. Embedded SQL is too large a topic to
be covered here,. but if you need help getting up to speed with embedded
SQL review "Empower Users with Embedded SQL" or
"Dynamic Selection with Embedded SQL."
Here is an RPG-embedded SQL program snippet that
retrieves the ShipDate field from the orders file
for a given OrderID:
Code:D RPGShipDate S D D RPGOrderID S 10I 0 C Eval RPGOrderID=11074 C/EXEC SQL C+ SELECT ShipDate C+ INTO :RPGShipDate C+ FROM Orders C+ WHERE OrderID=:RPGOrderID C/END-EXEC C Select C When SQLCOD=*Zero â?¦ OK C When SQLCOD=100 â?¦ Data not found C Other â?¦ Error/Warning given C EndSl
In case you're unfamiliar with embedded SQL, here's a quick review.
SQL statements are placed between compiler directives
/EXEC SQL and /END-EXEC. The plus sign (+) in position 7 is a
continuation marker, indicating that the SQL statement spans more
than one line.
The SQLCOD (SQL code) variable is used by the RPG program to test
the success of an SQL statement. SQLCOD and many other SQL related
"status" fields are added automatically to the program by the SQL
precompiler.
An SQL code of zero means the statement ran okay. An SQL code of 100 means
the statement didn't return or process data. An SQL code with a negative
value indicates an error condition, while an SQL code with a positive value
indicates a warning.
The identifiers prefixed with a colon ( : ) are not field names from the database
file but are actually RPG variable names. I've prefixed the RPG variable
names with the letters RPG to minimize confusion between database fields and
RPG variables. When the above SQL statement runs, SQL will look in the file
for the OrderID that matches the value contained in variable RPGOrderID.
When it finds the record, it will return the ShipDate field value into the
RPG variable RPGShipDate. In SQL terminology, these variables are
called "host variables", because they are from the "hosting"
high-level-language program.
The SELECT INTO statement is similar to a CHAIN, and is usually used when
there is only one record to be retrieved.
The concept is simple enough. The ShipDate for the given OrderID is placed
in program variable RPGShipDate. But what if ShipDate contains a NULL
in the file? The way the program is coded now, when the statement
retrieves a NULL the SQLCOD will contain error code -305, which
has the primary error description of "indicator variable required."
This message indicates that SQL can't return the NULL value into a
HLL program variable without the assistance of an "indicator"
variable. An indicator variable is simply an extra variable,
defined with an SQL data type of SMALLINT, containing either
a zero or a negative 1, to indicate if the given variable is
NOT NULL or NULL, respectively. Here's the revised code using
an indicator variable that will behave correctly if a NULL is read:
Code:D ShipDateNull S 5I 0 D NULL S 5I 0 Inz(-1) C Eval OrderID=11074 C/EXEC SQL C+ SELECT ShipDate C+ INTO :RPGShipDate:ShipDateNull C+ FROM Orders C+ WHERE OrderID=:OrderID C/END-EXEC C Select C When SQLCOD=*Zero C If ShipDateNull=NULL â?¦ ShipDate is NULL â?? Process accordingly C Else â?¦ ShipDate is NOT NULL â?? Process accordingly C EndIf
RPG variable is appended immediately after the variable name. As
with the %NULLIND built-in function, NULL indicator variables demand
that an extra step be taken to check if a variable is NULL, rather
than a blank or a zero. When using embedded SQL, every potential
field that can have a NULL should have an accompanying NULL indicator
variable. If knowing the state of the NULL is unimportant to the HLL
program, the COALESCE or IFNULL SQL functions can be used, to negate
the need for an indicator variable by guaranteeing that the column
will never be NULL.
It has already been noted that NULL indicator variables are supposed
to be defined as the equivalent of the SQL SMALLINT data type
(a two-byte binary field). In RPG IV, this can be coded by defining a
field as "5I 0" on the D-specs. To determine how to define the SMALLINT
data type in another language, see the DB2 Universal Database for iSeries
SQL Programming with Host Languages guide (PDF format).
This guide covers many facets of embedded SQL for each
particular high-level language and includes charts on how to
define a given HLL data type to match a corresponding SQL data type.
The Nagging Nulls
Using NULLs with high-level languages can be a bit of a chore. But as
more applications use NULL-capable fields, remembering some simple
rules about NULL indicator variables will allow high-level-language
programs and NULLs to coexist peacefully.
Michael Sansoterra is a programmer/analyst for SilverLake Resources,
an IT services firm based in Grand Rapids, Michigan. E-mail:
msansoterra@silver-lake.comAll my answers were extracted from the "Big Dummy's Guide to the As400"
and I take no responsibility for any of them.
www.code400.com
Comment