
Analyzing and interpreting neural networks for NLP: A report on the first BlackboxNLP workshop

Published online by Cambridge University Press:  31 July 2019

Afra Alishahi*
Affiliation:
Department of Cognitive Science and Artificial Intelligence, Tilburg University, The Netherlands
Grzegorz Chrupała
Affiliation:
Department of Cognitive Science and Artificial Intelligence, Tilburg University, The Netherlands
Tal Linzen
Affiliation:
Department of Cognitive Science, Johns Hopkins University, Baltimore, United States
*Corresponding author. Email: A.Alishahi@uvt.nl

Abstract

The Empirical Methods in Natural Language Processing (EMNLP) 2018 workshop BlackboxNLP was dedicated to resources and techniques specifically developed for analyzing and understanding the inner workings of neural models of language and the representations they acquire. Approaches included: systematically manipulating the input to neural networks and investigating the impact on their performance, testing whether interpretable knowledge can be decoded from the intermediate representations acquired by neural networks, proposing modifications to neural network architectures to make their knowledge state or generated output more explainable, and examining the performance of networks on simplified or formal languages. Here we review a number of representative studies in each category.
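
One recurring approach mentioned above is "probing": training a simple diagnostic classifier to test whether a linguistic property can be decoded from a network's intermediate representations. The sketch below is a minimal illustration of this idea using scikit-learn; it is not taken from any of the reviewed papers. The random feature matrix stands in for hidden states or sentence embeddings that would, in practice, be extracted from a trained model, and the binary labels stand in for an annotated property such as tense.

    # Minimal sketch of a probing (diagnostic classifier) experiment.
    # The representations here are random stand-ins; in a real study they
    # would be hidden states or sentence embeddings from a trained model.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)

    # Placeholder "sentence representations" (n_sentences x hidden_size)
    # and binary labels for a hypothetical property (e.g. past vs. present tense).
    X = rng.normal(size=(1000, 256))
    y = rng.integers(0, 2, size=1000)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    # A linear probe: accuracy well above chance on held-out data would
    # indicate that the property is linearly decodable from the representations.
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("probe accuracy:", accuracy_score(y_test, probe.predict(X_test)))

With random stand-in features the probe performs at chance; the point of the sketch is only the experimental setup, in which the representations are frozen and all learning happens in the lightweight probe.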

Type
Article
Copyright
© Cambridge University Press 2019 

