How to Handle Null Values using Talend

Posted by BDD Talend Practice

Null Handling

There’s nothing more frustrating than having to look through thousands of lines of Java code to find the cause of a NullPointerException.

If you know what you’re doing, it’s usually not too much of a problem; a little understanding of Java classes and exceptions goes a long way. This is as much a Java lesson as it is a Talend one.

Literals

Generally speaking, literals are to be avoided; a literal is a value that has been hard-coded into your source code. In the following code fragment, “Hello World!” is a literal.

String message = "Hello World!";

Primitive Type Variables

Variables can represent data using Java primitive types, such as int, long, double, and boolean.

Class Instance Variables

A Class Instance Variable is a pointer to an instance of an object whose type is a class, for example, String.

In the following example, the Class Instance Variable myString is a pointer to an object of the type String.

String myString;

Primitive Type Variable or Class Instance Variable?

Certain data types may be stored as either primitive types or objects. Have you ever noticed that, in a Talend schema, integer data types, for example, are shown as int | Integer? Why is this?

An integer can be stored as either a primitive type (int) or an object (Integer). From an efficiency standpoint, data is best stored as int; however, one important value that cannot be represented by int is null. If you assign no value to an int field, its value is zero.

If you define a column as Nullable in your schema, then an Integer type is used, because you need to be able to store a null value; otherwise, an int is used.
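The difference can be seen in a few lines of plain Java. This is only a sketch: the class and field names are hypothetical stand-ins for a non-nullable and a nullable schema column, not code generated by Talend.

```java
// Sketch: default values of a primitive int field and an Integer field.
class IntVersusInteger {
    // Hypothetical fields standing in for two schema columns.
    static int notNullableCount;   // primitive: defaults to 0 when never assigned
    static Integer nullableCount;  // object reference: defaults to null

    // Describes an Integer without risking a NullPointerException.
    static String describe(Integer value) {
        return value == null ? "null" : "value " + value;
    }

    public static void main(String[] args) {
        System.out.println(describe(notNullableCount)); // the int is auto-boxed to an Integer
        System.out.println(describe(nullableCount));    // the reference really is null
    }
}
```

An int can never tell you that a value was missing; an Integer can, at the cost of needing a null test before use.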

Handling null Class Instance Variables (Objects)

When we talk about null handling in Talend (Java), we are specifically talking about handling Class Instance Variables (objects) that are null pointers.

If you attempt to call a method on a Class Instance Variable that is a null pointer, Java will throw a NullPointerException.

Testing for null Pointers

It’s simple to test for a null pointer, as shown below:

if(myString == null) System.out.println("myString is null");

Talend provides the routine public static boolean ISNULL(Object variable) in routines.Relational, which makes the same == test as that shown above.
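The test such a routine performs is a plain reference comparison. The sketch below uses a local method, isNull, as a hypothetical stand-in so that it runs outside Talend; it is not Talend's source code.

```java
// Sketch: a reference comparison of the kind a null-test routine performs.
class NullCheckSketch {
    // Hypothetical stand-in for routines.Relational.ISNULL; not Talend's source.
    static boolean isNull(Object variable) {
        return variable == null;
    }

    public static void main(String[] args) {
        String myString = null;
        System.out.println(isNull(myString)); // a null pointer
        System.out.println(isNull(""));       // a zero-length String is still an object
    }
}
```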

It is good practice to always test for a null pointer before using an object.

In these two, admittedly unrealistic, examples, the first will throw a NullPointerException.

Example #1 – Bad

String myString = null;
if(myString.length() > 0) System.out.println(myString.toUpperCase());

Example #2 – Good

String myString = null;
if(myString != null && myString.length() > 0) System.out.println(myString.toUpperCase());

Other Considerations for String Instance Variables

Two other String values that you may want to consider are:

  • “” (a zero-length string)
  • “null” (the string “null”)

Zero-length String

A zero-length String is something you will often want to handle, usually by converting it to a null pointer. Take an example where you are reading a CSV file and writing it to a MySQL database.

When a string input from your CSV file is zero-length, it is sensible to store this as a NULL value in your database. The following mapping expression shows how you might handle this.

row1.MyString == null ? null : row1.MyString.length() == 0 ? null : row1.MyString;

You may, of course, want to wrap up the above statement in a routine.
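A minimal sketch of such a routine follows. The class name StringHandlingSketch and the method name emptyToNull are hypothetical; in a real project the method would sit in a user routine class of your own choosing.

```java
// Sketch: converts a zero-length String to null and leaves other values untouched.
class StringHandlingSketch {
    // Hypothetical routine: returns null for null or zero-length input.
    static String emptyToNull(String value) {
        return (value == null || value.length() == 0) ? null : value;
    }

    public static void main(String[] args) {
        System.out.println(emptyToNull(""));      // zero-length input becomes null
        System.out.println(emptyToNull("Hello")); // other values pass through unchanged
    }
}
```

Assuming the class has been added to the project as a routine, the mapping expression would shrink to a single call such as StringHandlingSketch.emptyToNull(row1.MyString).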

“null” String (Talend Context)

When you create a new context variable, Talend assigns a default value of null, as shown in the screenshot below.

Image 1

At first glance, you may think that Talend is doing this to show that the context variable is, in fact, a null pointer; however, this is not the case. Talend has actually assigned "null" as a value. If you remove this default value, the context variable will then become a zero-length String. To my mind, this is an unhelpful bug.
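Because a String can be a null pointer, a zero-length string, or the literal text "null", a job may need to tell the three apart. The following is a sketch only; the class and method names are hypothetical, and the value passed in stands for whatever a context variable happens to hold.

```java
// Sketch: distinguishes a null pointer, a zero-length String, and the text "null".
class ContextValueSketch {
    static String classify(String value) {
        if (value == null) {
            return "null pointer";
        }
        if (value.isEmpty()) {
            return "zero-length string";
        }
        // Constant-first equals() is safe even if value were null.
        if ("null".equals(value)) {
            return "the string \"null\"";
        }
        return "other value";
    }

    public static void main(String[] args) {
        System.out.println(classify(null));
        System.out.println(classify(""));
        System.out.println(classify("null"));
    }
}
```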

The other interesting observation with a context variable is that, when assigning values in the Contexts tab, double-quoting is optional, as demonstrated below.

Image 2

// tJava_1 Code
System.out.println(context.new1);
System.out.println(context.new2);

Image 3

 

 
