Hello, I need to build a new system with a table that keeps 60 days of data (approx. 1 billion rows). Once that base data is in place, more data will be loaded daily (approx. 10M rows/day, arriving as around 40 files with 40K rows each).
The system needs to reject and report any new row whose key already exists (about 90% of the incoming data is expected to be valid; the other 10% is expected to be duplicates). I also need to keep the table trimmed to the latest 60 days of data.
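One way to sketch the reject-and-report requirement is Oracle's DML error logging clause: failed rows (including unique-constraint violations) are diverted to an error table instead of failing the whole statement. All table and column names below are hypothetical, and note one caveat as I understand it: error logging cannot capture unique-key violations during a direct-path (APPEND) insert, so this particular approach needs a conventional-path insert.

```sql
-- Create the error table once (hypothetical target table LOAD_DATA).
BEGIN
  DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'LOAD_DATA');
END;
/

-- Daily load: good rows go in, duplicates land in ERR$_LOAD_DATA
-- with the Oracle error code and a tag identifying the load.
-- Conventional path (no APPEND hint) so unique violations are logged.
INSERT INTO load_data
SELECT * FROM load_data_ext             -- external table over today's files
LOG ERRORS INTO err$_load_data ('daily load ' || TO_CHAR(SYSDATE, 'YYYYMMDD'))
REJECT LIMIT UNLIMITED;
```

The error table can then be queried to produce the rejection report for that day's load.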
What approach do you recommend?
I was planning to use a table partitioned by day (is it possible to have 60 partitions?), put a primary key on the unique values, load the data through an external table (or SQL*Loader), and insert it with the APPEND hint. A separate process would then drop the oldest partition and create a new one for today's data.
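To make the question concrete, here is a minimal sketch of what I have in mind, with hypothetical names. One range partition per day, 60 kept online (60 is well under Oracle's partition limit). Since the primary key does not include the partition key, its index would have to be global, which is why the drop uses UPDATE GLOBAL INDEXES:

```sql
-- Hypothetical table: one range partition per calendar day.
CREATE TABLE load_data (
  id         NUMBER        NOT NULL,
  load_date  DATE          NOT NULL,
  payload    VARCHAR2(100),
  CONSTRAINT load_data_pk PRIMARY KEY (id)  -- global index; uniqueness spans partitions
)
PARTITION BY RANGE (load_date)
( PARTITION p20240101 VALUES LESS THAN (DATE '2024-01-02'),
  PARTITION p20240102 VALUES LESS THAN (DATE '2024-01-03')
  -- ... and so on, one partition per day, up to 60
);

-- Daily maintenance: add tomorrow's partition, drop the oldest one.
ALTER TABLE load_data ADD PARTITION p20240302
  VALUES LESS THAN (DATE '2024-03-03');
ALTER TABLE load_data DROP PARTITION p20240101 UPDATE GLOBAL INDEXES;
```

Is this roughly the right shape, or is there a better way to handle the rolling 60-day window?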
I’m not very familiar with partitioned tables, so any example and help is appreciated.
What do you think about this approach?
Any other ideas?
Thanks,
Diego