For example, suppose we have a WordCountService with a countWords method:
    import org.apache.spark.{SparkConf, SparkContext}

    class WordCountService {
      def countWords(url: String): Map[String, Int] = {
        // The context is created inside the method, so the cluster address is hard-coded.
        val sparkConf = new SparkConf().setMaster("spark://somehost:7077").setAppName("WordCount")
        val sc = new SparkContext(sparkConf)
        val textFile = sc.textFile(url)
        textFile.flatMap(line => line.split(" "))
          .map(word => (word, 1))
          .reduceByKey(_ + _)
          .collect()
          .toMap
      }
    }
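This service is ugly and hard to unit test, because it creates its own SparkContext and is tied to a specific cluster. The SparkContext should be injected into the service instead. You can do that with your favorite DI framework, but for simplicity it is passed in through the constructor here: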
    import org.apache.spark.SparkContext

    class WordCountService(val sc: SparkContext) {
      def countWords(url: String): Map[String, Int] = {
        val textFile = sc.textFile(url)
        textFile.flatMap(line => line.split(" "))
          .map(word => (word, 1))
          .reduceByKey(_ + _)
          .collect()
          .toMap
      }
    }
Now we can write a simple JUnit test and inject a test SparkContext into WordCountService:
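    import org.apache.spark.{SparkConf, SparkContext}
    import org.junit.{Assert, Test}

    class WordCountServiceTest {
      // local[*] runs Spark inside the test JVM, so no cluster is needed.
      val sparkConf = new SparkConf().setMaster("local[*]").setAppName("WordCountTest")
      val testContext = new SparkContext(sparkConf)
      val wordCountService = new WordCountService(testContext)

      @Test
      def countWordsTest(): Unit = {
        // Placeholder path; a local file URI normally has the form file:///absolute/path.
        val testFilePath = "file://my-test-file.txt"
        val counts = wordCountService.countWords(testFilePath)
        // JUnit expects (expected, actual).
        Assert.assertEquals(121, counts("dog"))
        Assert.assertEquals(191, counts("cat"))
      }
    }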
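The test above depends on an external file whose word counts have to match the hard-coded numbers. As a rough sketch of one way around that, the variant below writes a small temporary file itself and stops the context after each test. It assumes JUnit 4; the class name WordCountServiceTempFileTest, the local[2] master, and the sample text are made up for illustration, and the service is repeated only so that the snippet stands on its own.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

import org.apache.spark.{SparkConf, SparkContext}
import org.junit.{After, Assert, Before, Test}

// Constructor-injected service, repeated here so the sketch is self-contained.
class WordCountService(val sc: SparkContext) {
  def countWords(url: String): Map[String, Int] =
    sc.textFile(url)
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .collect()
      .toMap
}

// Hypothetical test class: creates its own input so the expected counts are visible in the test.
class WordCountServiceTempFileTest {
  private var sc: SparkContext = _
  private var inputFile: Path = _

  @Before
  def setUp(): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("WordCountTempFileTest")
    sc = new SparkContext(conf)
    // Sample text (an assumption for this sketch): "dog" appears 3 times, "cat" 2 times.
    inputFile = Files.createTempFile("word-count", ".txt")
    Files.write(inputFile, "dog cat dog\ncat dog".getBytes(StandardCharsets.UTF_8))
  }

  @After
  def tearDown(): Unit = {
    // Stop the context so the next test can create a fresh one in the same JVM.
    if (sc != null) sc.stop()
    if (inputFile != null) Files.deleteIfExists(inputFile)
  }

  @Test
  def countsWordsInSmallFile(): Unit = {
    // toUri yields a file:/// URI that sc.textFile can read from the local file system.
    val counts = new WordCountService(sc).countWords(inputFile.toUri.toString)
    Assert.assertEquals(3, counts("dog"))
    Assert.assertEquals(2, counts("cat"))
  }
}
```

Creating and stopping the context per test keeps tests isolated at the cost of startup time; sharing one context across a test class is a common alternative when the suite grows.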