In this paper, an efficient architecture of an FPGA-based bitmap index creation is proposed. The design utilizes a content addressable memory together with a bit-level transpose matrix to index multi-record documents by several given keywords. The experiments in a Cyclone V SX FPGA proved that our circuit could attain throughput of 330.6 million records per second while only using around 71% of embedded memory together with 45% of lookup tables and registers. In fact, achieved throughput is 2.8 times and 1.7 times as high as that of CPU-based and GPU-based design, respectively.