Problem benchmarking the population of a Seq vs. a List in Scala
0 votes
/ April 29, 2019

I have been trying to better understand the rationale for choosing Seq or List in Scala. To help with that, I am trying to build a simple timed example in which I instantiate each of them, both populated with the same number of elements; see below.

import com.typesafe.scalalogging.LazyLogging

object SeqVsList extends App with LazyLogging {

  private val numberOfElements = 1234567

  // whichever of these is run first takes the most amount of time
  populateSeq()
  populateList()

  def populateSeq(): Unit = {
    val seqStartTime = System.currentTimeMillis()
    val aSeq = Seq.fill(numberOfElements)("foo")
    logger.info(s"Populating Seq took ${System.currentTimeMillis() - seqStartTime} ms")
  }

  def populateList(): Unit = {
    val listStartTime = System.currentTimeMillis()
    val aList = List.fill(numberOfElements)("bar")
    logger.info(s"Populating List took ${System.currentTimeMillis() - listStartTime} ms")
  }
}

The problem I am running into (as noted in my comment in the code) is that the example does not accurately show which of the two is faster to populate. Instead, whichever method I call first is always the slowest.

I assume something is happening behind the scenes, such as a lot of objects being loaded into memory at runtime, that slows down whichever of the two methods runs first? If anyone could help shed some light on this, I would be very grateful.
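A common explanation for this pattern on the JVM is warm-up: the first call pays for class loading and for running in the interpreter before the JIT compiler kicks in. The sketch below is only an illustration of that idea, not a rigorous benchmark. It runs a few untimed rounds first and only then measures each collection. The names `WarmupSketch` and `timeMs` and the choice of 10 warm-up rounds are assumptions made for this example.

```scala
object WarmupSketch {
  // Hypothetical helper: evaluates `body` once and returns its result
  // together with the elapsed wall-clock time in milliseconds.
  def timeMs[A](body: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1e6)
  }

  def main(args: Array[String]): Unit = {
    val numberOfElements = 1234567

    // Untimed warm-up rounds (10 is an arbitrary choice) so that class
    // loading and JIT compilation happen before anything is measured.
    for (_ <- 1 to 10) {
      Seq.fill(numberOfElements)("foo")
      List.fill(numberOfElements)("bar")
    }

    val (aSeq, seqMs)   = timeMs(Seq.fill(numberOfElements)("foo"))
    val (aList, listMs) = timeMs(List.fill(numberOfElements)("bar"))

    // Printing the sizes keeps the results in use after timing.
    println(f"Populating Seq (${aSeq.size} elements) took $seqMs%.1f ms")
    println(f"Populating List (${aList.size} elements) took $listMs%.1f ms")
  }
}
```

Even with warm-up, single measurements like these remain noisy (garbage collection alone can skew them), so a harness such as JMH is the more reliable option.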

1 Answer

1 vote
/ April 29, 2019

I just tried to verify the idea that Seq and List perform the same, using a benchmark written with sbt-jmh:

package bmks

import java.util.concurrent.TimeUnit

import org.openjdk.jmh.annotations.{Benchmark, OutputTimeUnit}

@OutputTimeUnit(TimeUnit.MILLISECONDS)
class TestBenchmark {

  @Benchmark
  def seq(): Seq[String] =
    Seq.fill(1234567)("foo")

  @Benchmark
  def list(): Seq[String] =
    List.fill(1234567)("foo")
}

I ran it with:

$ sbt
sbt:benchmarks> jmh:run -i 20 -wi 10 -f1 -t1

and got:

sbt:benchmarks> jmh:run -i 20 -wi 10 -f1 -t1
[info] Compiling 1 Scala source to /Volumes/AuroraHD/DEV/scala/benchmarks/target/scala-2.12/classes ...
[info] Done compiling.
[info] Packaging /Volumes/AuroraHD/DEV/scala/benchmarks/target/scala-2.12/benchmarks_2.12-1.0.jar ...
Processing 1 classes from /Volumes/AuroraHD/DEV/scala/benchmarks/target/scala-2.12/classes with "reflection" generator
Writing out Java source to /Volumes/AuroraHD/DEV/scala/benchmarks/target/scala-2.12/src_managed/jmh and resources to /Volumes/AuroraHD/DEV/scala/benchmarks/target/scala-2.12/resource_managed/jmh
[info] Done packaging.
[info] Compiling 6 Java sources to /Volumes/AuroraHD/DEV/scala/benchmarks/target/scala-2.12/classes ...
[info] Done compiling.
[info] Packaging /Volumes/AuroraHD/DEV/scala/benchmarks/target/scala-2.12/benchmarks_2.12-1.0-jmh.jar ...
[info] Done packaging.
[info] Running (fork) org.openjdk.jmh.Main -i 20 -wi 10 -f1 -t1
[info] # JMH version: 1.21
[info] # VM version: JDK 1.8.0_161, Java HotSpot(TM) 64-Bit Server VM, 25.161-b12
[info] # VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_161.jdk/Contents/Home/jre/bin/java
[info] # VM options: <none>
[info] # Warmup: 10 iterations, 10 s each
[info] # Measurement: 20 iterations, 10 s each
[info] # Timeout: 10 min per iteration
[info] # Threads: 1 thread, will synchronize iterations
[info] # Benchmark mode: Throughput, ops/time
[info] # Benchmark: bmks.TestBenchmark.list
[info] # Run progress: 0.00% complete, ETA 00:10:00
[info] # Fork: 1 of 1
[info] # Warmup Iteration   1: 0.091 ops/ms
[info] # Warmup Iteration   2: 0.111 ops/ms
[info] # Warmup Iteration   3: 0.111 ops/ms
[info] # Warmup Iteration   4: 0.113 ops/ms
[info] # Warmup Iteration   5: 0.112 ops/ms
[info] # Warmup Iteration   6: 0.115 ops/ms
[info] # Warmup Iteration   7: 0.114 ops/ms
[info] # Warmup Iteration   8: 0.116 ops/ms
[info] # Warmup Iteration   9: 0.115 ops/ms
[info] # Warmup Iteration  10: 0.115 ops/ms
[info] Iteration   1: 0.115 ops/ms
[info] Iteration   2: 0.116 ops/ms
[info] Iteration   3: 0.114 ops/ms
[info] Iteration   4: 0.114 ops/ms
[info] Iteration   5: 0.115 ops/ms
[info] Iteration   6: 0.114 ops/ms
[info] Iteration   7: 0.116 ops/ms
[info] Iteration   8: 0.115 ops/ms
[info] Iteration   9: 0.115 ops/ms
[info] Iteration  10: 0.115 ops/ms
[info] Iteration  11: 0.115 ops/ms
[info] Iteration  12: 0.115 ops/ms
[info] Iteration  13: 0.114 ops/ms
[info] Iteration  14: 0.116 ops/ms
[info] Iteration  15: 0.115 ops/ms
[info] Iteration  16: 0.115 ops/ms
[info] Iteration  17: 0.115 ops/ms
[info] Iteration  18: 0.114 ops/ms
[info] Iteration  19: 0.114 ops/ms
[info] Iteration  20: 0.117 ops/ms
[info] Result "bmks.TestBenchmark.list":
[info]   0.115 ±(99.9%) 0.001 ops/ms [Average]
[info]   (min, avg, max) = (0.114, 0.115, 0.117), stdev = 0.001
[info]   CI (99.9%): [0.114, 0.116] (assumes normal distribution)
[info] # JMH version: 1.21
[info] # VM version: JDK 1.8.0_161, Java HotSpot(TM) 64-Bit Server VM, 25.161-b12
[info] # VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_161.jdk/Contents/Home/jre/bin/java
[info] # VM options: <none>
[info] # Warmup: 10 iterations, 10 s each
[info] # Measurement: 20 iterations, 10 s each
[info] # Timeout: 10 min per iteration
[info] # Threads: 1 thread, will synchronize iterations
[info] # Benchmark mode: Throughput, ops/time
[info] # Benchmark: bmks.TestBenchmark.seq
[info] # Run progress: 50.00% complete, ETA 00:05:01
[info] # Fork: 1 of 1
[info] # Warmup Iteration   1: 0.094 ops/ms
[info] # Warmup Iteration   2: 0.115 ops/ms
[info] # Warmup Iteration   3: 0.118 ops/ms
[info] # Warmup Iteration   4: 0.115 ops/ms
[info] # Warmup Iteration   5: 0.114 ops/ms
[info] # Warmup Iteration   6: 0.115 ops/ms
[info] # Warmup Iteration   7: 0.115 ops/ms
[info] # Warmup Iteration   8: 0.115 ops/ms
[info] # Warmup Iteration   9: 0.114 ops/ms
[info] # Warmup Iteration  10: 0.117 ops/ms
[info] Iteration   1: 0.116 ops/ms
[info] Iteration   2: 0.116 ops/ms
[info] Iteration   3: 0.089 ops/ms
[info] Iteration   4: 0.116 ops/ms
[info] Iteration   5: 0.116 ops/ms
[info] Iteration   6: 0.118 ops/ms
[info] Iteration   7: 0.116 ops/ms
[info] Iteration   8: 0.118 ops/ms
[info] Iteration   9: 0.118 ops/ms
[info] Iteration  10: 0.117 ops/ms
[info] Iteration  11: 0.117 ops/ms
[info] Iteration  12: 0.107 ops/ms
[info] Iteration  13: 0.111 ops/ms
[info] Iteration  14: 0.113 ops/ms
[info] Iteration  15: 0.113 ops/ms
[info] Iteration  16: 0.114 ops/ms
[info] Iteration  17: 0.114 ops/ms
[info] Iteration  18: 0.114 ops/ms
[info] Iteration  19: 0.114 ops/ms
[info] Iteration  20: 0.114 ops/ms
[info] Result "bmks.TestBenchmark.seq":
[info]   0.114 ±(99.9%) 0.005 ops/ms [Average]
[info]   (min, avg, max) = (0.089, 0.114, 0.118), stdev = 0.006
[info]   CI (99.9%): [0.108, 0.119] (assumes normal distribution)
[info] # Run complete. Total time: 00:10:02
[info] REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
[info] why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
[info] experiments, perform baseline and negative tests that provide experimental control, make sure
[info] the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
[info] Do not assume the numbers tell you what you want them to tell.
[info] Benchmark            Mode  Cnt  Score   Error   Units
[info] TestBenchmark.list  thrpt   20  0.115 ± 0.001  ops/ms
[info] TestBenchmark.seq   thrpt   20  0.114 ± 0.005  ops/ms
[success] Total time: 607 s, completed Apr 29, 2019 8:35:22 PM

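Conclusion: they are equal. The two scores, 0.115 ± 0.001 ops/ms for List and 0.114 ± 0.005 ops/ms for Seq, are within each other's error margins.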
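A likely reason for the matching numbers, as far as I know for the Scala 2.12 and 2.13 standard libraries: the default builder behind `Seq` produces a `List`, so `Seq.fill` and `List.fill` end up doing essentially the same work. The small check below is a sketch of how to confirm this on your own Scala version. The names `SeqIsList` and `build` are made up for this example.

```scala
object SeqIsList {
  // Hypothetical helper: builds a collection through the Seq companion,
  // exactly as the `seq` benchmark does, just with a configurable size.
  def build(n: Int): Seq[String] = Seq.fill(n)("foo")

  def main(args: Array[String]): Unit = {
    val aSeq = build(3)

    // The static type is Seq[String]; this inspects the runtime class.
    println(s"Runtime class: ${aSeq.getClass.getName}")
    println(s"Is a List: ${aSeq.isInstanceOf[List[_]]}")
  }
}
```

If the second line prints `true` on your setup, the benchmark above is effectively comparing `List.fill` with itself, which would explain why the scores are indistinguishable.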

...