Hey,
The setup time may be the result of a very large GoP size. The GoP size is determined by how often a keyframe is sent, and players often will not start playback until they receive a keyframe, so a long keyframe interval means a long wait before anything appears. You can set the GoP size with ffmpeg; it is measured in frames, so you’ll need to know your frame rate.
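As a rough sketch of the arithmetic: the GoP size in frames is the frame rate multiplied by the keyframe interval you want in seconds. The snippet below computes that and builds an ffmpeg command line around it. The input file, the RTMP URL, the 2-second interval, and the choice of libx264 and AAC are all assumptions for illustration, not details from your setup, so adjust them to match your encoder.

```python
import shlex


def gop_size(frame_rate: float, keyframe_interval_s: float) -> int:
    """Return the GoP size in frames for a desired keyframe interval."""
    return max(1, round(frame_rate * keyframe_interval_s))


def ffmpeg_args(src: str, dst: str, frame_rate: float,
                keyframe_interval_s: float) -> list[str]:
    """Build an ffmpeg argument list with a fixed GoP size.

    Assumes H.264 via libx264 and an FLV/RTMP output; both are
    illustrative choices, not requirements.
    """
    gop = gop_size(frame_rate, keyframe_interval_s)
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",
        "-r", str(frame_rate),      # output frame rate
        "-g", str(gop),             # maximum frames between keyframes
        "-keyint_min", str(gop),    # minimum frames between keyframes
        "-sc_threshold", "0",       # no extra keyframes on scene cuts
        "-c:a", "aac",
        "-f", "flv", dst,
    ]


if __name__ == "__main__":
    # Hypothetical source and destination; replace with your own.
    args = ffmpeg_args("input.mp4", "rtmp://example.invalid/live/stream", 30, 2)
    print(shlex.join(args))
```

At 30 fps with a keyframe every 2 seconds this gives a GoP of 60 frames, so a player that joins mid-stream should wait at most about 2 seconds for its first keyframe.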
Regarding RTMP and HTML5: unfortunately, there is no true HTML5 player that supports RTMP. You will have to use DASH (or in some cases HLS), but that will introduce latency, which may or may not be acceptable for your scenario.
Bryan