Description

Pasting the log

Error Message

expected:<2> but was:<1>
Stacktrace

junit.framework.AssertionFailedError: expected:<2> but was:<1>
at junit.framework.Assert.fail(Assert.java:50)
at junit.framework.Assert.failNotEquals(Assert.java:287)
at junit.framework.Assert.assertEquals(Assert.java:67)
at junit.framework.Assert.assertEquals(Assert.java:199)
at junit.framework.Assert.assertEquals(Assert.java:205)
at org.apache.hadoop.security.ssl.TestReloadingX509TrustManager.testReload(TestReloadingX509TrustManager.java:112)
Standard Output

2014-07-06 06:12:21,170 WARN ssl.ReloadingX509TrustManager (ReloadingX509TrustManager.java:run(197)) - Could not load truststore (keep using existing one) : java.io.EOFException
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:628)
at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:38)
at java.security.KeyStore.load(KeyStore.java:1185)
at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.loadTrustManager(ReloadingX509TrustManager.java:166)
at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)
at java.lang.Thread.run(Thread.java:662)

Jitendra Nath Pandey
added a comment - 08/Aug/16 18:12 I have committed this to trunk, branch-2 and branch-2.8. Thanks Ratandeep for an earlier patch, and thanks to Mingliang Liu for taking it to completion.

Jitendra Nath Pandey
added a comment - 05/Aug/16 18:44 I think checking for log non-empty can be error prone, if another log line is added in the thread. Also, if log output is cleared after the last reload, the test will fail.

Per-offline discussion with Jitendra Nath Pandey, we don't have to change the trust store file modification time. The 1s sleep interval is good enough to make sure after the new store the file has a different modification time from the 1st (initial) one.

Basically there are two problems here in the code. 1) The modification time has a precision which can be 1 second (os- and jdk-dependent). 2) even after the TS file modification time changes, we have to wait the reloader thread to reload the contents before asserting newly added records.

Plus, there are three similar places in the test testReloadCorruptTrustStore and testReloadMissingTrustStore that we should address them in this patch. The problem is different in that other two test methods should wait for the reloader thread to reload at least once before assertion.

Mingliang Liu
added a comment - 05/Aug/16 00:58 Per-offline discussion with Jitendra Nath Pandey , we don't have to change the trust store file modification time. The 1s sleep interval is good enough to make sure after the new store the file has a different modification time from the 1st (initial) one.
Basically there are two problems here in the code. 1) The modification time has a precision which can be 1 second (os- and jdk-dependent). 2) even after the TS file modification time changes, we have to wait the reloader thread to reload the contents before asserting newly added records.
Plus, there are three similar places in the test testReloadCorruptTrustStore and testReloadMissingTrustStore that we should address them in this patch. The problem is different in that other two test methods should wait for the reloader thread to reload at least once before assertion.
Attached v2 patch.

I think increasing the sleep time is not ideal. According to the Über-jira proposal, we should not use fixed sleep interval to wait for conditions. To address this, one approach is to coordinate the test thread and the reloader thread which, however, needs intrusive changes in the production code. A better solution is to trigger the configuration reload manually in test. We can achieve this by changing the modification time of the trust store file. After that, we can use GenericTestUtils.waitFor() to retry the assertion. I attached the v1 patch for addressing this.

Mingliang Liu
added a comment - 04/Aug/16 23:27 This test is flaky and fails recently, e.g. HADOOP-13444 and HADOOP-13297 .
I think increasing the sleep time is not ideal. According to the Über-jira proposal , we should not use fixed sleep interval to wait for conditions. To address this, one approach is to coordinate the test thread and the reloader thread which, however, needs intrusive changes in the production code. A better solution is to trigger the configuration reload manually in test. We can achieve this by changing the modification time of the trust store file. After that, we can use GenericTestUtils.waitFor() to retry the assertion. I attached the v1 patch for addressing this.
Ratandeep Ratti , thoughts? Ping Arpit Agarwal and Jitendra Nath Pandey for more input. Thanks.

It seems that the trust-store file created with 2 issuers is not loaded in time for the test-case to verify. The loading of the trust-store file happens in different thread by the TrustStoreManager. A simple fix is to wait for a longer time. Currently it is 0.2 secs

Ratandeep Ratti
added a comment - 15/Jul/14 07:21 This test is sporadically failing on our test machines
It seems that the trust-store file created with 2 issuers is not loaded in time for the test-case to verify. The loading of the trust-store file happens in different thread by the TrustStoreManager. A simple fix is to wait for a longer time. Currently it is 0.2 secs