I was using Pentaho Data Integration even before Pentaho bought it and called it that.
I have the last free version.
I went on their website recently to see if they had released another version only to ...

Question:
I have a script with around 45 thousand insert-from-select statements. When I try to run it, I get an error message stating that I have run out of memory. How can I get this script to run?
...

I'm using PostgreSQL 9.1 on Ubuntu. Is a scheduled VACUUM ANALYZE still recommended, or is autovacuum enough to take care of all needs?
If the answer is "it depends", then:
I have a largish database ...

I realized that my company uses an ELT (extract-load-transform) process instead of using an ETL (extract-transform-load) process.
What are the differences between the two approaches and in which situations ...

Based on my analysis, a complete dimensional model for our data warehouse will require extraction from over 200 source tables. Some of these tables will be extracted as part of an incremental load and ...

For performance reasons in some scenarios, e.g. Amazon EC2, you have access to a faster and cheaper storage device, which loses all its data on reboots, so it is called "ephemeral".
This question is ...

In our current prod environment, when the ETL launches, it first checks an environment variable at the OS level which tells it where the "config" database and table are for it to get the necessary ...

I am dealing with a rather massive ETL process that keeps throwing the ORA-12549 error.
I have been looking for problems in the database, so far without finding anything. The DBA says the database is "fine" - ...

I have two types of tables to populate the data warehouse with every day. The first are lookup or configuration tables with a few hundred records, and those are easy: I just truncate and refill the table.
But for ...

What book or webpage can I read to learn the architecture and methodology for creating an ETL process?
In other words, I'm looking for "how to do it" to create an ETL process when you have many source systems to ...

My apologies if this has already been answered. I've searched SO and of course here on DBA, and am surprised to find no close matches. Specifically, I'm looking for a solution which survives schema ...

Suppose I have a "main" table that was probably heavily normalized and consists largely of columns that merely contain codes that are lookups into other tables (they are probably foreign keys but feel ...

This question originates more from an "enterprise architecture" point of view. (Reason being, our company is just starting to get into data management, and actually having DBAs -- I know, horrific ...

I have a 47 GB MySQL dump of a single table:
http://dumps.wikimedia.org/commonswiki/latest/commonswiki-latest-image.sql.gz
I ultimately want it into PostgreSQL, but since I didn't figure out an easy ...

I'm trying to set up a regular import of an Excel spreadsheet that we get from a vendor. I'm using SQL 2008 R2 SSIS to import it into a table. The problem connection manager is an OLE DB connection ...

I know that those letters mean Extract, Transform, and Load.
But, when I used it at first, I thought that during the Transform phase I could do plenty of different joins on data that I've extracted ...

We have an ETL process that inserts lots of data into tables. This database is set to the Simple Recovery Model and the transaction log is growing a lot. I was wondering whether it would help to set ...

Using stored procedures as part of an ETL process to populate tables in another database, which method would be considered a best practice (if any) in terms of deployment, maintenance, visibility and ...

A process that I don't have control over is dropping and re-creating tables in a MySQL database every night. This wouldn't be a problem (I think), if it re-created the tables identically every time. ...

I have a set of five tables (a highly decomposed schema for an ETL if I understand the nomenclature) that I'm going to load via bulk import, then run some inserts from those five tables into a SLEW of ...

I have an SSIS package targeted at SQL Server 2012.
I have it deployed into the Stored Packages in Integration Services and then have a SQL Server Agent job which executes it.
The first part of the ...

I have a database with 13 billion rows, with around 20-30 million rows per day. On top of this I have one cube; one of its dimensions is a DateTime that goes down to milliseconds. To load the fact table I ...

I've been given a somewhat vague requirement regarding a data extract process which I need some help with. I'm fairly new to database warehousing etc. but I will try to explain everything relevant.
...

I am looking for a solution giving me automated archiving of SQL Server databases based on a smart interpretation of table relationships that is able to archive records from related tables minimizing ...

In analyzing some of our business data, we're realizing that we need to join data across disparate data sources. For example, our application data is warehoused in Postgres (ported from MongoDB via ...

I understand from various questions [1] [2] that SQL Server Express 2008 R2 doesn't support SSIS packages and that there's only primitive support for ad-hoc data imports / exports.
I have already got ...

I've recently been tasked with exporting data from one of our databases to a central Data Warehouse. The export is supposed to be incremental since a full load every 24 hours would be a massive waste ...