Category Archives: apache-drill

Apache Drill: Create Table Error While Selecting from Json Data

The command below works fine:

SELECT TO_TIMESTAMP(ts,'yyyyMMddHHmmss') FROM dfs.tmp.`/mapr/my.cluster.com/hive/cpf_sales.json`

But when I try to create a table from a SELECT statement, I get an error. Below are the ones I tried:

ALTER SESSION SET store.format='json'; use dfs;

CREATE TABLE by_yr (gen_date) AS SELECT TO_TIMESTAMP(ts,'yyyyMMddHHmmss') FROM dfs./mapr/my.cluster.com/hive/cpf_sales.json LIMIT 100;

Error: org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: Encountered ";" at line 1, column 8. Was expecting one of: "." ... "[" ... SQL Query use dfs; ^ CREATE TABLE by_yr (gen_date) AS SELECT TO_TIMESTAMP(ts,'yyyyMMddHHmmss') FROM dfs./mapr/my.cluster.com/hive/cpf_sales.json LIMIT 100 [Error Id: 81cbe394-b3c6-4c34-80ad-83325f748ae1 on iot3:31010]

use dfs.tmp;

CREATE TABLE by_yr (gen_date) AS SELECT TO_TIMESTAMP(ts,'yyyyMMddHHmmss') FROM dfs.tmp/mapr/my.cluster.com/hive/cpf_sales.json LIMIT 100;

Error: org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: Encountered ";" at line 1, column 12. Was expecting one of: "." ... "[" ... SQL Query use dfs.tmp; ^ SELECT COLUMNS[0], COLUMNS[1] from dfs.tmp./mapr/my.cluster.com/donuts.json [Error Id: 5e9d1d20-a804-4d09-8b69-d76b3c009647 on iot2:31010]
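Both parse errors point at the ";" that follows the USE statement, which suggests the whole input reached the parser as a single query. The following is a minimal sketch, assuming the client accepts only one statement per submission and no trailing semicolon; that assumption is not confirmed in the post. StatementSplitter and its split method are hypothetical helpers, not part of Drill. They cut a script into individual statements that can then be submitted one at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Drill API): splits a SQL script on ';'
// while leaving semicolons inside '...' or `...` untouched.
final class StatementSplitter {

    static List<String> split(String script) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 means "not inside a quoted region"
        for (char c : script.toCharArray()) {
            if (quote != 0) {
                // Inside quotes: copy everything, close on the matching quote
                if (c == quote) {
                    quote = 0;
                }
                current.append(c);
            } else if (c == '\'' || c == '`') {
                quote = c;
                current.append(c);
            } else if (c == ';') {
                // Statement boundary: the semicolon itself is dropped
                addIfNotBlank(statements, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addIfNotBlank(statements, current);
        return statements;
    }

    private static void addIfNotBlank(List<String> out, StringBuilder sb) {
        String s = sb.toString().trim();
        if (!s.isEmpty()) {
            out.add(s);
        }
    }

    public static void main(String[] args) {
        // Each printed line is one statement to submit separately
        for (String s : split("ALTER SESSION SET store.format='json'; use dfs;")) {
            System.out.println(s);
        }
    }
}
```

Each returned string could then be passed to its own java.sql.Statement.execute call, or entered as a separate query in the client being used.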

Apache Drill – SQL Server plugin does not ‘Show Tables’

Using Apache Drill, I successfully created a new plugin: mssql

Configuration:

{
  type: "jdbc",
  driver: "com.microsoft.sqlserver.jdbc.SQLServerDriver",
  url: "jdbc:sqlserver://99.99.99.999:1433;databaseName=ABC",
  username: "abcuser",
  password: "abcuser",
  enabled: true
} 

But when I try to query a table, I get an error:

select * from mssql.ABC.dbo.TableName

Error:

org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 1, column 15 to line 1, column 19: Table 'mssql.ABC.dbo.TableName' not found SQL Query null [Error Id: feba9fdb-1621-438a-9d7c-304e4252a41f on AA99-9AA9A99.xyz.abc.com:31010]

Even the command below returns no tables:

show tables;

Unable to create a JDBC Storage Plugin for Solr in Apache Drill

I'm unable to create a JDBC Storage Plugin for Solr in Apache Drill. I've added the solr-solrj.jar and the jars from the solrj-libs folder into the jars/3rdparty folder for drill. Here is my Configuration:

{
  "type": "jdbc",
  "driver": "org.apache.solr.client.solrj.io.sql.DriverImpl",
  "url": "jdbc:solr://<host>:<port>/?collection=<collection-name>",
  "username": "",
  "password": "",
  "enabled": true
}

I have put the actual host, port and collection-name values in the above configuration.

Once I press Create I get the following message: Please retry: error (unable to create/ update storage)

Does anyone know whether Apache Drill allows the Solr JDBC driver to be set up and used as a data source?

Performance of Apache Ignite vs Apache Drill for SQL

I need to fetch data from some large MySQL tables to display on a dashboard/web portal. My main focus is improving SQL performance given the size of the datasets.

Also, is Apache Ignite less scalable than Apache Drill considering Ignite uses RAM as a primary data source?

Please let me know if more detail is needed.

I have been through these links:
http://drcos.boudnik.org/2015/04/apache-ignite-vs-apache-spark.html
https://mpouttuclarke.wordpress.com/2016/01/04/why-i-tried-apache-spark-and-moved-on/

Does using the optional HDFS layer beneath IGFS slow the system down to the level of SparkSQL? https://ignite.apache.org/features/igfs.html

Unable to create the storage plugin for hive in Apache drill

I am new to Apache Drill. I get an error while creating the storage plugin for Apache Hive. I have tried two approaches; the configurations are below.

1. First approach:

        {
          "type": "hive",
          "enabled": false,
          "configProps": {
            "hive.metastore.uris": "thrift2:localhost:10000",
            "fs.default.name": "hdfs://localhost:9000/",
            "hive.metastore.sasl.enabled": "false"
          }
        }

2. Second approach:

        {
          "type": "hive",
          "enabled": false,
          "configProps": {
            "hive.metastore.uris": "",
            "javax.jdo.option.ConnectionURL": "jdbc:derby://localhost:1527/metastore_db;create=true",
            "hive.metastore.warehouse.dir": "/user/tmp/warehouse/hive",
            "fs.default.name": "hdfs://localhost:9000",
            "hive.metastore.sasl.enabled": "false"
          }
        }

I am using plain Apache components, and both Drill and Hive 2 are installed on the same machine.

In both cases I get the following error in the GUI:

Please retry: error (unable to create/ update storage)

Kindly help me resolve this. Thanks in advance!

Apache Drill JDBC Java Client Exception

I am new to Apache Drill and have it set up to run locally in embedded mode and via the Web interface. However, I am facing the following issue when trying to access it from a Java client using JDBC.

Following the Drill docs and a few posts here, my setup is:

<dependency>
    <groupId>org.apache.drill.exec</groupId>
    <artifactId>drill-jdbc</artifactId>
    <version>1.7.0</version>
</dependency>

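Code:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public static void main(String[] args) throws Exception {
  // Load the Drill JDBC driver class
  Class.forName("org.apache.drill.jdbc.Driver");
  // The exception shown below is thrown on this line
  Connection connection = DriverManager.getConnection("jdbc:drill:zk=local");
  Statement st = connection.createStatement();
  ResultSet rs = st.executeQuery("SELECT * from cp.`employee` LIMIT 10");
  // Print the first column of each returned row
  while (rs.next()) {
    System.out.println(rs.getString(1));
  }
...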

There are no compile issues; however, on running the above, I get the following NoClassDefFoundError (for Drill's OutOfMemoryException class) at the DriverManager.getConnection call:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/drill/exec/exception/OutOfMemoryException
 at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:64)
 at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69)
 at net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:126)
 at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
 at java.sql.DriverManager.getConnection(DriverManager.java:664)
 at java.sql.DriverManager.getConnection(DriverManager.java:270)
 at com.mapr.drill.DrillJDBCExample.runMode1(DrillJDBCExample.java:49)
 at com.mapr.drill.DrillJDBCExample.main(DrillJDBCExample.java:21)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:497)
 at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Caused by: java.lang.ClassNotFoundException: org.apache.drill.exec.exception.OutOfMemoryException
 at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 ... 13 more

I did this with Drill running locally. I also tried changing the JDBC URL to "jdbc:drill:drillbit=localhost".

Please help.
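The root cause in the trace above is a ClassNotFoundException for org.apache.drill.exec.exception.OutOfMemoryException, which means that class was not on the runtime classpath when the driver tried to open the connection. A minimal sketch for confirming this before connecting follows. ClasspathCheck and isLoadable are hypothetical helper names, not part of Drill or the JDK; the two class names checked are the ones that appear in the post.

```java
// Hypothetical diagnostic helper: reports whether a class can be found
// on the current runtime classpath without initializing it.
final class ClasspathCheck {

    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate and load the class, run no static code
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.drill.jdbc.Driver",
            "org.apache.drill.exec.exception.OutOfMemoryException"
        };
        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

If the driver class loads but the exception class does not, the dependency that provides the driver is probably not bringing every class it needs onto the classpath at run time.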

Get week of the year from date in apache drill

I'm trying to convert a timestamp in some data (using apache drill) to the week of year.

According to the reference, the closest thing I can see is the "extract" function; however, this only works for the YEAR, MONTH, DAY, HOUR, MINUTE and SECOND units.

Original date example

ID, START_DATE
1, 2014-07-07T13:20:34.000Z 

Query (something like this)

SELECT ID, EXTRACT(WEEK, START_DATE) FROM MY_TABLE; 

And get a result like this:

ID, START_DATE
1, 27 
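"Week of the year" has more than one definition, so it helps to pin down which one is expected. The sketch below uses the JDK's java.time classes, not Drill itself, on the sample timestamp from the question. WeekOfYearSketch and its method names are hypothetical. It computes the ISO-8601 week (weeks start on Monday, and week 1 contains the first Thursday of the year) and the "aligned" week (seven-day blocks counted from January 1). For 2014-07-07 the ISO week is 28, while the aligned week is 27, which matches the expected result shown above.

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoField;
import java.time.temporal.IsoFields;

// Hypothetical helper comparing two common week-of-year definitions.
final class WeekOfYearSketch {

    // Parse an ISO-8601 timestamp such as 2014-07-07T13:20:34.000Z
    // and take its calendar date in UTC.
    static LocalDate utcDate(String isoTimestamp) {
        return OffsetDateTime.parse(isoTimestamp)
                .withOffsetSameInstant(ZoneOffset.UTC)
                .toLocalDate();
    }

    // ISO-8601 week number: Monday-based, week 1 holds the first Thursday.
    static int isoWeek(String isoTimestamp) {
        return utcDate(isoTimestamp).get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }

    // Aligned week number: ((dayOfYear - 1) / 7) + 1.
    static int alignedWeek(String isoTimestamp) {
        return utcDate(isoTimestamp).get(ChronoField.ALIGNED_WEEK_OF_YEAR);
    }

    public static void main(String[] args) {
        String ts = "2014-07-07T13:20:34.000Z";
        System.out.println("ISO week: " + isoWeek(ts));         // 28
        System.out.println("Aligned week: " + alignedWeek(ts)); // 27
    }
}
```

Whichever definition is wanted, the same arithmetic would need to be reproduced in the query, or applied to the timestamps after they are fetched.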

Should I use Apache Drill or Apache Solr?

I have a MongoDB database that contains a collection with millions of documents. I need to query these documents based on conditions on certain columns. I have read the Apache Drill documentation closely, and since it allows writing queries in SQL it interests me, but I'm concerned about the performance. My query would be:

select * from table where col1= cond1 and col2= cond2 ... and col12=cond12;

Only two conditions are mandatory, but there can be up to 12 conditions on different 'columns'. Is Solr better suited for this, or is Apache Drill suitable for the job? I have looked everywhere, and the Apache Drill documentation is richer. Thank you.